Abstract. The restricted isometry property (RIP) is a well-known matrix condition that 
provides state-of-the-art reconstruction guarantees for compressed sensing. While random 
matrices are known to satisfy this property with high probability, deterministic constructions 
have found less success. In this paper, we consider various techniques for demonstrating 
RIP deterministically, some popular and some novel, and we evaluate their performance. In 
evaluating some techniques, we apply random matrix theory and inadvertently find a simple 
alternative proof that certain random matrices are RIP. Later, we propose a particular class 
of matrices as candidates for being RIP, namely, equiangular tight frames (ETFs). Using 
the known correspondence between real ETFs and strongly regular graphs, we investigate 
certain combinatorial implications of a real ETF being RIP. Specifically, we give probabilistic 
intuition for a new bound on the clique number of Paley graphs of prime order, and we 
conjecture that the corresponding ETFs are RIP in a manner similar to random matrices. 



1. Introduction 

Let $x$ be an unknown $N$-dimensional vector with the property that at most $K$ of its entries
are nonzero, that is, $x$ is $K$-sparse. The goal of compressed sensing is to construct relatively
few non-adaptive linear measurements along with a stable and efficient reconstruction algorithm
that exploits this sparsity structure. Expressing each measurement as a row of an
$M \times N$ matrix $\Phi$, we have the following noisy system:

\[ y = \Phi x + z. \tag{1} \]

In the spirit of compressed sensing, we only want a few measurements: $M \ll N$. Also, in
order for there to exist an inversion process for (1), $\Phi$ must map $K$-sparse vectors injectively,
or equivalently, every subcollection of $2K$ columns of $\Phi$ must be linearly independent.
Unfortunately, the natural reconstruction method in this general case, i.e., finding the sparsest
approximation of $y$ from the dictionary of columns of $\Phi$, is known to be NP-hard [21].
Moreover, the independence requirement does not impose any sort of dissimilarity between
columns of $\Phi$, meaning distinct identity basis elements could lead to similar measurements,
thereby bringing instability in reconstruction.

To get around the NP-hardness of sparse approximation, we need more structure in the
matrix $\Phi$. Instead of considering linear independence of all subcollections of $2K$ columns, it
has become common to impose a much stronger requirement: that every submatrix of $2K$
columns of $\Phi$ be well-conditioned. To be explicit, we have the following definition:

Definition 1. The matrix $\Phi$ has the $(K,\delta)$-restricted isometry property (RIP) if

\[ (1-\delta)\|x\|^2 \leq \|\Phi x\|^2 \leq (1+\delta)\|x\|^2 \]

for every $K$-sparse vector $x$. The smallest $\delta$ for which $\Phi$ is $(K,\delta)$-RIP is the restricted
isometry constant (RIC) $\delta_K$.

In words, matrices which satisfy RIP act as a near-isometry on sufficiently sparse vectors. 
Note that a $(2K,\delta)$-RIP matrix with $\delta < 1$ necessarily has that all subcollections of $2K$
columns are linearly independent. Also, the well-conditioning requirement of RIP forces
dissimilarity in the columns of $\Phi$ to provide stability in reconstruction. Most importantly,
the additional structure of RIP allows for the possibility of getting around the NP-hardness 
of sparse approximation. Indeed, a significant result in compressed sensing is that RIP 
sensing matrices enable efficient reconstruction: 

Theorem 2 (Theorem 1.3 in [8]). Suppose an $M \times N$ matrix $\Phi$ has the $(2K,\delta)$-restricted
isometry property for some $\delta < \sqrt{2}-1$. Assuming $\|z\| \leq \varepsilon$, then for every $K$-sparse vector
$x \in \mathbb{R}^N$, the following reconstruction from (1):

\[ \tilde{x} = \arg\min \|\hat{x}\|_1 \quad \text{s.t.} \quad \|y - \Phi\hat{x}\| \leq \varepsilon \]

satisfies $\|\tilde{x} - x\| \leq C\varepsilon$, where $C$ only depends on $\delta$.

The fact that RIP sensing matrices convert an NP-hard reconstruction problem into an 
$\ell_1$-minimization problem has prompted many in the community to construct RIP matrices.
Among these constructions, the most successful have been random matrices, such as matrices
with independent Gaussian or Bernoulli entries [1], or matrices whose rows were randomly
selected from the discrete Fourier transform matrix [25]. With high probability, these random
constructions support sparsity levels $K$ on the order of $\frac{M}{\log^\alpha N}$ for some $\alpha \geq 1$. Intuitively,
this level of sparsity is near-optimal because $K$ cannot exceed $\frac{M}{2}$ by the linear independence
condition. Unfortunately, it is difficult to check whether a particular instance of a random
matrix is $(K,\delta)$-RIP, as this involves the calculation of singular values for all $\binom{N}{K}$ submatrices
of K columns of the matrix. For this reason, and for the sake of reliable sensing standards, 
many have become interested in finding deterministic RIP matrix constructions. 

In the next section, we review the well-understood techniques that are commonly used to 
analyze the restricted isometry of deterministic constructions: the Gershgorin circle theorem, 
and the spark of a matrix. Unfortunately, neither technique demonstrates RIP for sparsity 
levels as large as what random constructions are known to support; rather, with these techniques,
a deterministic $M \times N$ matrix $\Phi$ can only be shown to have RIP for sparsity levels on
the order of $\sqrt{M}$. This limitation has become known as the "square-root bottleneck," and
it poses an important problem in matrix design [5U] . 

To date, the only deterministic construction that manages to go beyond this bottleneck
is given by Bourgain et al. [7]; in Section 3, we discuss what they call flat RIP, which is
the technique they use to demonstrate RIP. It is important to stress the significance of
their contribution: before [7], it was unclear how deterministic analysis might break the
bottleneck, and as such, their result is a major theoretical achievement. On the other
hand, their improvement over the square-root bottleneck is notably slight compared to what
random matrices provide. However, by our Theorem 14, their technique can actually be
used to demonstrate RIP for sparsity levels much larger than $\sqrt{M}$, meaning one could very
well demonstrate random-like performance given the proper construction. Our result applies
their technique to random matrices, and it inadvertently serves as a simple alternative proof
that certain random matrices are RIP. In Section 4, we introduce an alternate technique,
which by our Theorem 17 can also demonstrate RIP for large sparsity levels.

After considering the efficacy of these techniques to demonstrate RIP, it remains to find 
a deterministic construction that is amenable to analysis. To this end, we discuss various 
properties of a particularly nice matrix which comes from frame theory, called an equiangular 
tight frame (ETF). Specifically, real ETFs can be characterized in terms of their Gram 
matrices using strongly regular graphs [32]. By applying the techniques of Sections 3 and 4 
to real ETFs, we derive equivalent combinatorial statements in graph theory. By focussing 
on the ETFs which correspond to Paley graphs of prime order, we are able to make important 
statements about their clique numbers and provide some intuition for an open problem in 
number theory. We conclude by conjecturing that the Paley ETFs are RIP in a manner 
similar to random matrices. 



2. Well-understood techniques 

2.1. Applying Gershgorin's circle theorem. Take an $M \times N$ matrix $\Phi$. For a given $K$,
we wish to find some $\delta$ for which $\Phi$ is $(K,\delta)$-RIP. To this end, it is useful to consider the
following expression for the restricted isometry constant:

\[ \delta_K = \max_{\substack{\mathcal{K} \subseteq \{1,\ldots,N\} \\ |\mathcal{K}| = K}} \|\Phi_\mathcal{K}^*\Phi_\mathcal{K} - I_K\|_2. \tag{2} \]

Here, $\Phi_\mathcal{K}$ denotes the submatrix consisting of columns of $\Phi$ indexed by $\mathcal{K}$. Note that we
are not tasked with actually computing $\delta_K$; rather, we recognize that $\Phi$ is $(K,\delta)$-RIP for
every $\delta \geq \delta_K$, and so we seek an upper bound on $\delta_K$. The following classical result offers a
particularly easy-to-calculate bound on eigenvalues:

Theorem 3 (Gershgorin circle theorem [H]). For each eigenvalue $\lambda$ of a $K \times K$ matrix $A$,
there is an index $i \in \{1,\ldots,K\}$ such that

\[ |\lambda - A[i,i]| \leq \sum_{\substack{j=1 \\ j \neq i}}^{K} |A[i,j]|. \]

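To use this theorem, take some $\Phi$ with unit-norm columns. Note that $\Phi_\mathcal{K}^*\Phi_\mathcal{K}$ is the
Gram matrix of the columns indexed by $\mathcal{K}$, and as such, the diagonal entries are 1, and
the off-diagonal entries are inner products between distinct columns of $\Phi$. Let $\mu$ denote the
worst-case coherence of $\Phi = [\varphi_1 \cdots \varphi_N]$:

\[ \mu := \max_{\substack{i,j \in \{1,\ldots,N\} \\ i \neq j}} |\langle \varphi_i, \varphi_j \rangle|. \]

Then the size of each off-diagonal entry of $\Phi_\mathcal{K}^*\Phi_\mathcal{K}$ is $\leq \mu$, regardless of our choice for $\mathcal{K}$.
Therefore, for every eigenvalue $\lambda$ of $\Phi_\mathcal{K}^*\Phi_\mathcal{K} - I_K$, the Gershgorin circle theorem gives

\[ |\lambda| = |\lambda - 0| \leq \sum_{\substack{j=1 \\ j \neq i}}^{K} |\langle \varphi_i, \varphi_j \rangle| \leq (K-1)\mu. \tag{3} \]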

Since (3) holds for every eigenvalue $\lambda$ of $\Phi_\mathcal{K}^*\Phi_\mathcal{K} - I_K$ and every choice of $\mathcal{K} \subseteq \{1,\ldots,N\}$, we
conclude from (2) that $\delta_K \leq (K-1)\mu$, i.e., $\Phi$ is $(K,(K-1)\mu)$-RIP. This process of using the
Gershgorin circle theorem to demonstrate RIP for deterministic constructions has become 
standard in the community [Sj [HI [16] . 
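
The Gershgorin estimate can be compared against the exact restricted isometry constant on a toy instance. The following is a minimal sketch, assuming NumPy and a small random matrix with normalized columns; the function names and the dimensions are illustrative choices, not part of the analysis above, and the brute-force enumeration of (2) is only feasible for very small $N$ and $K$.

```python
import itertools
import numpy as np

def worst_case_coherence(Phi):
    """Largest |<phi_i, phi_j>| over distinct columns (columns assumed unit norm)."""
    G = Phi.conj().T @ Phi
    off_diagonal = np.abs(G - np.diag(np.diag(G)))
    return off_diagonal.max()

def ric_brute_force(Phi, K):
    """delta_K as in (2): enumerate every K-column submatrix (exponential cost)."""
    N = Phi.shape[1]
    best = 0.0
    for cols in itertools.combinations(range(N), K):
        sub = Phi[:, list(cols)]
        deviation = sub.conj().T @ sub - np.eye(K)
        best = max(best, np.linalg.norm(deviation, 2))  # spectral norm
    return best

# Illustrative instance (an assumption): 8 x 12 Gaussian matrix, columns normalized.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((8, 12))
Phi /= np.linalg.norm(Phi, axis=0)

K = 3
mu = worst_case_coherence(Phi)
gershgorin_bound = (K - 1) * mu      # upper bound on delta_K from (3)
delta_K = ric_brute_force(Phi, K)    # exact value, for comparison
print(delta_K, gershgorin_bound)
```

The gap between the two printed numbers reflects the cancellations that the Gershgorin argument ignores.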

Recall that random RIP constructions support sparsity levels $K$ on the order of $\frac{M}{\log^\alpha N}$ for
some $\alpha \geq 1$. To see how well the Gershgorin circle theorem demonstrates RIP, we need to
express $\mu$ in terms of $M$ and $N$. To this end, we consider the following result:

Theorem 4 (Welch bound [33]). Every $M \times N$ matrix with unit-norm columns has worst-case
coherence

\[ \mu \geq \sqrt{\frac{N-M}{M(N-1)}}. \]
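
As a quick numerical illustration of Theorem 4, the sketch below evaluates the Welch bound for a few dimensions; the helper name `welch_bound` is a hypothetical choice. For $N \gg M$ the bound behaves like $1/\sqrt{M}$, which is what ties coherence-based arguments to the square-root bottleneck.

```python
import math

def welch_bound(M, N):
    """Lower bound on the worst-case coherence of an M x N matrix with unit-norm columns."""
    if not (1 <= M <= N and N >= 2):
        raise ValueError("need 1 <= M <= N and N >= 2")
    return math.sqrt((N - M) / (M * (N - 1)))

# For M = 6, N = 16 the bound is sqrt(10/90) = 1/3.
print(welch_bound(6, 16))

# When N is much larger than M, the bound approaches 1/sqrt(M).
print(welch_bound(100, 10**6), 1 / math.sqrt(100))
```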



To use this result, we consider matrices whose worst-case coherence achieves equality in 
the Welch bound. These are known as equiangular tight frames [21], which can be defined 
as follows: 

Definition 5. A matrix is said to be an equiangular tight frame (ETF) if 

(i) the columns have unit norm, 

(ii) the rows are orthogonal with equal norm, and 

(iii) the inner products between distinct columns are equal in modulus. 

To date, there are three general constructions that build several families of ETFs [161 [SS 
[M] . Since ETFs achieve equality in the Welch bound, we can further analyze what it means 
for an $M \times N$ ETF $\Phi$ to be $(K,(K-1)\mu)$-RIP. In particular, since Theorem 2 requires that
$\Phi$ be $(2K,\delta)$-RIP for $\delta < \sqrt{2}-1$, it suffices to have $\frac{2K}{\sqrt{M}} < \sqrt{2}-1$, since this implies

\[ \delta = (2K-1)\mu = (2K-1)\sqrt{\frac{N-M}{M(N-1)}} \leq \frac{2K}{\sqrt{M}} < \sqrt{2}-1. \tag{4} \]

That is, ETFs form sensing matrices that support sparsity levels $K$ on the order of $\sqrt{M}$. Most
other deterministic constructions have identical bounds on sparsity levels [21[I11IIS]- In fact, 
since ETFs minimize coherence, they are necessarily optimal constructions in terms of the 
Gershgorin demonstration of RIP, but the question remains whether they are actually RIP 
for larger sparsity levels; the Gershgorin demonstration fails to account for cancellations in 
the sub-Gram matrices $\Phi_\mathcal{K}^*\Phi_\mathcal{K}$, and so this technique is too weak to indicate either possibility.

2.2. Spark considerations. Recall that, in order for an inversion process for (1) to exist, $\Phi$
must map $K$-sparse vectors injectively, or equivalently, every subcollection of $2K$ columns of
$\Phi$ must be linearly independent. This linear independence condition can be nicely expressed
in more general terms, as the following definition provides:

Definition 6. The spark of a matrix $\Phi$ is the size of the smallest linearly dependent subset
of columns, i.e.,

\[ \operatorname{Spark}(\Phi) = \min\big\{ \|x\|_0 : \Phi x = 0,\ x \neq 0 \big\}. \]

This definition was introduced by Donoho and Elad [15] to help build a theory of sparse
representation that later gave birth to modern compressed sensing. The concept of spark is
also found in matroid theory, where it goes by the name girth [1]. The condition that every
subcollection of $2K$ columns of $\Phi$ is linearly independent is equivalent to $\operatorname{Spark}(\Phi) > 2K$.
Relating spark to RIP, suppose $\Phi$ is $(K,\delta)$-RIP with $\operatorname{Spark}(\Phi) \leq K$. Then there exists a
nonzero $K$-sparse vector $x$ such that

\[ (1-\delta)\|x\|^2 \leq \|\Phi x\|^2 = 0, \]

and so $\delta \geq 1$. The reason behind this stems from our necessary linear independence condition:
RIP implies linear independence, and so small spark implies linear dependence, which in turn
implies not RIP.
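
Definition 6 can be made concrete with a brute-force search. The following is a minimal sketch, assuming NumPy; the function name `spark`, the rank tolerance, and the convention of returning `None` for a matrix with linearly independent columns are illustrative assumptions. The search is exponential in the number of columns, so it is only meant for toy matrices.

```python
import itertools
import numpy as np

def spark(Phi, tol=1e-10):
    """Size of the smallest linearly dependent set of columns, or None if there is none."""
    N = Phi.shape[1]
    for size in range(1, N + 1):
        for cols in itertools.combinations(range(N), size):
            sub = Phi[:, list(cols)]
            # A set of `size` columns is dependent exactly when its rank is below `size`.
            if np.linalg.matrix_rank(sub, tol=tol) < size:
                return size
    return None

# Three vectors in the plane, pairwise independent: spark 3.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
# Adding a multiple of the first column drops the spark to 2.
B = np.array([[1.0, 0.0, 1.0, 2.0],
              [0.0, 1.0, 1.0, 0.0]])
print(spark(A), spark(B))
```

In the notation above, a matrix with spark at most $K$ cannot be $(K,\delta)$-RIP for any $\delta < 1$.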

As an example of using spark to analyze RIP, we now consider a construction that dates
back to Seidel [27], and was recently developed further in [16]. Here, a special type of block
design is used to build an ETF. Let's start with a definition:

Definition 7. A $(t,k,v)$-Steiner system is a $v$-element set $V$ with a collection of $k$-element
subsets of $V$, called blocks, with the property that any $t$-element subset of $V$ is contained
in exactly one block. The $\{0,1\}$-incidence matrix $A$ of a Steiner system has entries $A[i,j]$,
where $A[i,j] = 1$ if the $i$th block contains the $j$th element, and otherwise $A[i,j] = 0$.

One example of a Steiner system is a set with all possible two-element blocks. This forms
a $(2,2,v)$-Steiner system because every pair of elements is contained in exactly one block.
The following theorem details how to construct ETFs using Steiner systems.

Theorem 8 (Theorem 1 in [16]). Every $(2,k,v)$-Steiner system can be used to build a
$\frac{v(v-1)}{k(k-1)} \times v(1 + \frac{v-1}{k-1})$ equiangular tight frame $\Phi$ according to the following procedure:

(i) Let $A$ be the $\frac{v(v-1)}{k(k-1)} \times v$ incidence matrix of a $(2,k,v)$-Steiner system.

(ii) Let $H$ be a $(1 + \frac{v-1}{k-1}) \times (1 + \frac{v-1}{k-1})$ (possibly complex) Hadamard matrix.

(iii) For each $j = 1,\ldots,v$, let $\Phi_j$ be a $\frac{v(v-1)}{k(k-1)} \times (1 + \frac{v-1}{k-1})$ matrix obtained from the $j$th
column of $A$ by replacing each of the one-valued entries with a distinct row of $H$, and
every zero-valued entry with a row of zeros.

(iv) Concatenate and rescale the $\Phi_j$'s to form $\Phi = (\frac{k-1}{v-1})^{1/2} [\Phi_1 \cdots \Phi_v]$.

As an example, we build an ETF from a (2,2,4)-Steiner system. In this case, we make use
of the corresponding incidence matrix $A$ along with a $4 \times 4$ Hadamard matrix $H$:

\[ A = \begin{bmatrix} + & + & & \\ + & & + & \\ + & & & + \\ & + & + & \\ & + & & + \\ & & + & + \end{bmatrix}, \qquad H = \begin{bmatrix} + & + & + & + \\ + & - & + & - \\ + & + & - & - \\ + & - & - & + \end{bmatrix}. \]

In both of these matrices, pluses represent 1's, minuses represent $-1$'s, and blank spaces
represent 0's. For the matrix $A$, each row represents a block. Since each block contains
two elements, each row of the matrix has two ones. Also, any two elements determine a
unique common row, and so any two columns have a single one in common. To form the
corresponding $6 \times 16$ ETF $\Phi$, we replace the three ones in each column of $A$ with the second,
third, and fourth rows of $H$. Normalizing the columns gives the following $6 \times 16$ ETF:



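\[ \Phi = \frac{1}{\sqrt{3}} \left[ \begin{array}{cccc|cccc|cccc|cccc}
+ & - & + & - & + & - & + & - & & & & & & & & \\
+ & + & - & - & & & & & + & - & + & - & & & & \\
+ & - & - & + & & & & & & & & & + & - & + & - \\
 & & & & + & + & - & - & + & + & - & - & & & & \\
 & & & & + & - & - & + & & & & & + & + & - & - \\
 & & & & & & & & + & - & - & + & + & - & - & +
\end{array} \right]. \tag{5} \]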



It is easy to verify that $\Phi$ satisfies Definition 5. Several infinite families of $(2,k,v)$-Steiner
systems are already known, and Theorem 8 says that each one can be used to build a different
ETF. Recall from the previous subsection that Steiner ETFs, being ETFs, are optimal
constructions in terms of the Gershgorin demonstration of RIP. We now use the notion of
spark to further analyze Steiner ETFs. Specifically, note that the first four columns in (5)
are linearly dependent. As such, $\operatorname{Spark}(\Phi) \leq 4$. In general, the spark of a Steiner ETF is
< 1^ < \/2M (see Theorem 3 of \16\ and discussion thereafter), and so having K on the 
order of $\sqrt{M}$ is necessary for a Steiner ETF to be $(K,\delta)$-RIP for some $\delta < 1$. This answers
the closing question of the previous subsection: in general, ETFs are not RIP for sparsity
levels larger than the order of $\sqrt{M}$. This contrasts with random constructions, which support
sparsity levels as large as the order of $\frac{M}{\log^\alpha N}$ for some $\alpha \geq 1$. That said, are there techniques
to demonstrate that certain deterministic matrices are RIP for sparsity levels larger than
the order of $\sqrt{M}$?
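
The $6 \times 16$ example of this subsection can be checked numerically. The following is a minimal sketch, assuming NumPy and restricting Theorem 8 to the $(2,2,v)$-Steiner system of all pairs; the function name `steiner_etf_from_pairs`, the block ordering, and the choice to skip the first row of $H$ are illustrative assumptions. It verifies the three conditions of Definition 5, equality in the Welch bound, and the linear dependence of the first four columns.

```python
import itertools
import numpy as np

def steiner_etf_from_pairs(v, H):
    """ETF built from the (2,2,v)-Steiner system of all pairs and a v x v Hadamard matrix H."""
    blocks = list(itertools.combinations(range(v), 2))   # one block per pair of elements
    Phi = np.zeros((len(blocks), v * v))
    for j in range(v):
        rows = [b for b, block in enumerate(blocks) if j in block]  # the v-1 blocks containing j
        for r, b in enumerate(rows):
            # Replace each one in column j of the incidence matrix by a distinct row of H,
            # using rows 2, ..., v of H (index r + 1 skips the first row).
            Phi[b, j * v:(j + 1) * v] = H[r + 1]
    return Phi / np.sqrt(v - 1)                            # rescale to unit-norm columns

H = np.array([[1,  1,  1,  1],
              [1, -1,  1, -1],
              [1,  1, -1, -1],
              [1, -1, -1,  1]], dtype=float)
Phi = steiner_etf_from_pairs(4, H)
M, N = Phi.shape

G = Phi.T @ Phi
off_diagonal = np.abs(G[~np.eye(N, dtype=bool)])
welch = np.sqrt((N - M) / (M * (N - 1)))

print(np.allclose(np.diag(G), 1.0))                  # (i) unit-norm columns
print(np.allclose(Phi @ Phi.T, (N / M) * np.eye(M))) # (ii) orthogonal rows of equal norm
print(np.allclose(off_diagonal, welch))              # (iii) equiangular, meeting the Welch bound
print(np.linalg.matrix_rank(Phi[:, :4]))             # rank 3 < 4: first four columns dependent
```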



3. Flat restricted orthogonality 

In [7], Bourgain et al. provided a deterministic construction of $M \times N$ RIP matrices that
support sparsity levels $K$ on the order of $M^{1/2+\varepsilon}$ for some small value of $\varepsilon$. To date, this
is the only known deterministic RIP construction that breaks the so-called "square-root
bottleneck." In this section, we analyze their technique for demonstrating RIP, but first, we
provide some historical context. We begin with a definition:

Definition 9. The matrix $\Phi$ has $(K,\theta)$-restricted orthogonality (RO) if

\[ |\langle \Phi x, \Phi y \rangle| \leq \theta \|x\| \|y\| \]

for every pair of $K$-sparse vectors $x$, $y$ with disjoint support. The smallest $\theta$ for which $\Phi$ has
$(K,\theta)$-RO is the restricted orthogonality constant (ROC) $\theta_K$.

In the past, restricted orthogonality was studied to produce reconstruction performance
guarantees for both $\ell_1$-minimization and the Dantzig selector [9, 10]. Intuitively, restricted
orthogonality is important to compressed sensing because any stable inversion process for (1)
would require $\Phi$ to map vectors of disjoint support to particularly dissimilar measurements.
For the present paper, we are interested in upper bounds on RICs; in this spirit, the following
result illustrates some sort of equivalence between RICs and ROCs:

Lemma 10 (Lemma 1.2 in [9]). $\theta_K \leq \delta_{2K} \leq \theta_K + \delta_K$.

To be fair, the above upper bound on $\delta_{2K}$ does not immediately help in estimating $\delta_{2K}$,
as it requires one to estimate $\delta_K$. Certainly, we may iteratively apply this bound to get

\[ \delta_{2K} \leq \theta_K + \theta_{\lceil K/2 \rceil} + \theta_{\lceil K/4 \rceil} + \cdots + \theta_1 + \delta_1 \leq (1 + \lceil \log_2 K \rceil)\theta_K + \delta_1. \tag{6} \]

Note that $\delta_1$ is particularly easy to calculate:

\[ \delta_1 = \max_{n \in \{1,\ldots,N\}} \big| \|\varphi_n\|^2 - 1 \big|, \]

which is zero when the columns of $\Phi$ have unit norm. In pursuit of a better upper bound
on $\delta_{2K}$, we use techniques from [7] to remove the log factor from (6):

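Lemma 11. $\delta_{2K} \leq 2\theta_K + \delta_1$.

Proof. Given a matrix $\Phi = [\varphi_1 \cdots \varphi_N]$, we want to upper-bound the smallest $\delta$ for which

\[ (1-\delta)\|x\|^2 \leq \|\Phi x\|^2 \leq (1+\delta)\|x\|^2, \]

or equivalently:

\[ \delta \geq \bigg| \Big\| \Phi \frac{x}{\|x\|} \Big\|^2 - 1 \bigg| \tag{7} \]

for every nonzero $2K$-sparse vector $x$. We observe from (7) that we may take $x$ to have unit
norm without loss of generality. Letting $\mathcal{K}$ denote a size-$2K$ set that contains the support
of $x$, and letting $\{x_k\}_{k \in \mathcal{K}}$ denote the corresponding entries of $x$, the triangle inequality gives

\begin{align*}
\big| \|\Phi x\|^2 - 1 \big|
&= \bigg| \Big\langle \sum_{i \in \mathcal{K}} x_i\varphi_i, \sum_{j \in \mathcal{K}} x_j\varphi_j \Big\rangle - 1 \bigg| \\
&= \bigg| \sum_{i \in \mathcal{K}} \sum_{\substack{j \in \mathcal{K} \\ j \neq i}} \langle x_i\varphi_i, x_j\varphi_j \rangle + \sum_{i \in \mathcal{K}} \|x_i\varphi_i\|^2 - 1 \bigg| \\
&\leq \bigg| \sum_{i \in \mathcal{K}} \sum_{\substack{j \in \mathcal{K} \\ j \neq i}} \langle x_i\varphi_i, x_j\varphi_j \rangle \bigg| + \bigg| \sum_{i \in \mathcal{K}} \|x_i\varphi_i\|^2 - 1 \bigg|. \tag{8}
\end{align*}

Since $\sum_{i \in \mathcal{K}} |x_i|^2 = 1$, the second term of (8) satisfies

\[ \bigg| \sum_{i \in \mathcal{K}} \|x_i\varphi_i\|^2 - 1 \bigg| \leq \sum_{i \in \mathcal{K}} |x_i|^2 \big| \|\varphi_i\|^2 - 1 \big| \leq \sum_{i \in \mathcal{K}} |x_i|^2 \delta_1 = \delta_1, \tag{9} \]

and so it remains to bound the first term of (8). To this end, we note that for each $i, j \in \mathcal{K}$
with $j \neq i$, the term $\langle x_i\varphi_i, x_j\varphi_j \rangle$ appears in

\[ \sum_{\substack{\mathcal{I} \subseteq \mathcal{K} \\ |\mathcal{I}| = K}} \sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{K} \setminus \mathcal{I}} \langle x_i\varphi_i, x_j\varphi_j \rangle \]

as many times as there are size-$K$ subsets of $\mathcal{K}$ which contain $i$ but not $j$, i.e., $\binom{2K-2}{K-1}$ times.
Thus, we use the triangle inequality and the definition of restricted orthogonality to get

\begin{align*}
\bigg| \sum_{i \in \mathcal{K}} \sum_{\substack{j \in \mathcal{K} \\ j \neq i}} \langle x_i\varphi_i, x_j\varphi_j \rangle \bigg|
&= \frac{1}{\binom{2K-2}{K-1}} \bigg| \sum_{\substack{\mathcal{I} \subseteq \mathcal{K} \\ |\mathcal{I}| = K}} \sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{K} \setminus \mathcal{I}} \langle x_i\varphi_i, x_j\varphi_j \rangle \bigg| \\
&\leq \frac{1}{\binom{2K-2}{K-1}} \sum_{\substack{\mathcal{I} \subseteq \mathcal{K} \\ |\mathcal{I}| = K}} \bigg| \Big\langle \sum_{i \in \mathcal{I}} x_i\varphi_i, \sum_{j \in \mathcal{K} \setminus \mathcal{I}} x_j\varphi_j \Big\rangle \bigg| \\
&\leq \frac{1}{\binom{2K-2}{K-1}} \sum_{\substack{\mathcal{I} \subseteq \mathcal{K} \\ |\mathcal{I}| = K}} \theta_K \Big( \sum_{i \in \mathcal{I}} |x_i|^2 \Big)^{1/2} \Big( \sum_{j \in \mathcal{K} \setminus \mathcal{I}} |x_j|^2 \Big)^{1/2}.
\end{align*}

At this point, $x$ having unit norm implies $(\sum_{i \in \mathcal{I}} |x_i|^2)^{1/2} (\sum_{j \in \mathcal{K} \setminus \mathcal{I}} |x_j|^2)^{1/2} \leq \frac{1}{2}$, and so

\[ \bigg| \sum_{i \in \mathcal{K}} \sum_{\substack{j \in \mathcal{K} \\ j \neq i}} \langle x_i\varphi_i, x_j\varphi_j \rangle \bigg| \leq \frac{1}{\binom{2K-2}{K-1}} \sum_{\substack{\mathcal{I} \subseteq \mathcal{K} \\ |\mathcal{I}| = K}} \frac{\theta_K}{2} = \frac{\binom{2K}{K}}{\binom{2K-2}{K-1}} \frac{\theta_K}{2} = \Big( 4 - \frac{2}{K} \Big) \frac{\theta_K}{2} \leq 2\theta_K. \]

Applying both this and (9) to (8) gives the result. □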

Having discussed the relationship between restricted isometry and restricted orthogonality, 
we are now ready to introduce the property used in [7] to demonstrate RIP: 

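Definition 12. The matrix $\Phi = [\varphi_1 \cdots \varphi_N]$ has $(K,\theta)$-flat restricted orthogonality if

\[ \bigg| \Big\langle \sum_{i \in \mathcal{I}} \varphi_i, \sum_{j \in \mathcal{J}} \varphi_j \Big\rangle \bigg| \leq \theta (|\mathcal{I}||\mathcal{J}|)^{1/2} \]

for every disjoint pair of subsets $\mathcal{I}, \mathcal{J} \subseteq \{1,\ldots,N\}$ with $|\mathcal{I}|, |\mathcal{J}| \leq K$.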
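
Definition 12 is a purely combinatorial condition on sums of Gram matrix entries, which the following minimal sketch makes explicit. It assumes NumPy, and the function name `flat_ro_constant` and the test dimensions are illustrative; the double loop over subsets is exponential and is only meant to show what is being bounded, not how one would verify the property for a real construction.

```python
import itertools
import numpy as np

def flat_ro_constant(Phi, K):
    """Smallest theta for which Phi has (K, theta)-flat restricted orthogonality."""
    N = Phi.shape[1]
    G = Phi.conj().T @ Phi
    subsets = [list(s) for size in range(1, K + 1)
               for s in itertools.combinations(range(N), size)]
    theta = 0.0
    for I in subsets:
        for J in subsets:
            if set(I) & set(J):
                continue  # only disjoint pairs are constrained
            # <sum_{i in I} phi_i, sum_{j in J} phi_j> is the sum of the Gram entries G[i, j].
            inner = G[np.ix_(I, J)].sum()
            theta = max(theta, abs(inner) / np.sqrt(len(I) * len(J)))
    return theta

# Illustrative instance (an assumption): 10 x 8 Gaussian matrix with normalized columns.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((10, 8))
Phi /= np.linalg.norm(Phi, axis=0)

G = Phi.T @ Phi
mu = np.abs(G - np.eye(8)).max()     # worst-case coherence
theta = flat_ro_constant(Phi, 2)
print(mu, theta)                     # theta >= mu, by taking singleton subsets
```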

Note that $\Phi$ has $(K,\theta_K)$-flat restricted orthogonality (FRO) by taking $x$ and $y$ in
Definition 9 to be the characteristic functions $\chi_\mathcal{I}$ and $\chi_\mathcal{J}$, respectively. Also to be clear, flat
restricted orthogonality is called flat RIP in [7]; we feel the name change is appropriate
considering the preceding literature. Moreover, the definition of flat RIP in [7] required $\Phi$
to have unit-norm columns, whereas we strengthen the corresponding results so as to make
no such requirement. Interestingly, FRO bears some resemblance to the cut-norm of the
Gram matrix $\Phi^*\Phi$, defined as the maximum value of $|\sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{J}} \langle \varphi_i, \varphi_j \rangle|$ over all subsets
$\mathcal{I}, \mathcal{J} \subseteq \{1,\ldots,N\}$; the cut-norm has received some attention recently for the hardness of
its approximation [2]. The following theorem illustrates the utility of flat restricted
orthogonality as an estimate of the RIC:

Theorem 13. A matrix with $(K,\theta)$-flat restricted orthogonality has a restricted orthogonality
constant $\theta_K$ which is $\leq C\theta\log K$, and we may take $C = 75$.

Indeed, when combined with Lemma 11, this result gives an upper bound on the RIC:
$\delta_{2K} \leq 2C\theta\log K + \delta_1$. The noteworthy benefit of this upper bound is that the problem of
estimating singular values of submatrices is reduced to a combinatorial problem of bounding
the coherence of disjoint sums of columns. Furthermore, this reduction comes at the price of
a mere log factor in the estimate. In [7], Bourgain et al. managed to satisfy this combinatorial
coherence property using techniques from additive combinatorics. While we will not discuss
their construction, we find the proof of Theorem 13 to be instructive; our proof is valid for
all values of $K$ (as opposed to sufficiently large $K$ in the original [7]), and it has near-optimal
constants where appropriate. The proof can be found in the Appendix.

To reiterate, Bourgain et al. [7] used flat restricted orthogonality to build the only known
deterministic construction of $M \times N$ RIP matrices that support sparsity levels $K$ on the
order of $M^{1/2+\varepsilon}$ for some small value of $\varepsilon$. We are particularly interested in the efficacy
of FRO as a technique to demonstrate RIP in general. Certainly, [7] shows that FRO can
produce at least an $\varepsilon$ improvement over the Gershgorin technique discussed in the previous
section, but it remains to be seen whether FRO can do better.

In the remainder of this section, we will show that flat restricted orthogonality is actually
capable of demonstrating RIP with much higher sparsity levels than indicated by [7].
Hopefully, this realization will spur further research in deterministic constructions which 
satisfy FRO. To evaluate FRO, we investigate how well it performs with random matrices; 
in doing so, we give an alternative proof that certain random matrices satisfy RIP with high 
probability: 

Theorem 14. Construct an $M \times N$ matrix $\Phi$ by drawing each of its entries independently
from a Gaussian distribution with mean zero and variance $\frac{1}{M}$, take $C$ to be the constant
from Theorem 13, and set $\alpha = 0.01$. Then $\Phi$ has $(K, \frac{(1-\alpha)\delta}{2C\log K})$-flat restricted orthogonality
and $\delta_1 \leq \alpha\delta$, and therefore the $(2K,\delta)$-restricted isometry property, with high probability
provided $M \geq \frac{33C^2}{\delta^2} K \log^2 K \log N$.

In proving this result, we will make use of the following Bernstein inequality: 

Theorem 15 (see [HI [35]). Let $\{Z_m\}_{m=1}^M$ be independent random variables of mean zero with
bounded moments, and suppose there exists $L > 0$ such that

\[ \mathbb{E}|Z_m|^k \leq \frac{\mathbb{E}|Z_m|^2}{2} L^{k-2} k! \tag{10} \]

for every $k \geq 2$. Then

\[ \Pr\bigg[ \sum_{m=1}^M Z_m \geq 2t \Big( \sum_{m=1}^M \mathbb{E}|Z_m|^2 \Big)^{1/2} \bigg] \leq e^{-t^2} \tag{11} \]

provided $t \leq \frac{1}{2L} \big( \sum_{m=1}^M \mathbb{E}|Z_m|^2 \big)^{1/2}$.

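Proof of Theorem 14. Considering Lemma 11, it suffices to show that $\Phi$ has restricted
orthogonality and that $\delta_1$ is sufficiently small. First, to demonstrate restricted orthogonality,
it suffices to demonstrate FRO by Theorem 13, and so we will ensure that the following
quantity is small:

\[ \Big\langle \sum_{i \in \mathcal{I}} \varphi_i, \sum_{j \in \mathcal{J}} \varphi_j \Big\rangle = \sum_{m=1}^M \Big( \sum_{i \in \mathcal{I}} \varphi_i[m] \Big) \Big( \sum_{j \in \mathcal{J}} \varphi_j[m] \Big). \tag{12} \]

Notice that $X_m := \sum_{i \in \mathcal{I}} \varphi_i[m]$ and $Y_m := \sum_{j \in \mathcal{J}} \varphi_j[m]$ are mutually independent over all
$m = 1,\ldots,M$ since $\mathcal{I}$ and $\mathcal{J}$ are disjoint. Also, $X_m$ is Gaussian with mean zero and
variance $\frac{|\mathcal{I}|}{M}$, while $Y_m$ similarly has mean zero and variance $\frac{|\mathcal{J}|}{M}$. Viewed this way, (12) being
small corresponds to the sum of independent random variables $Z_m := X_m Y_m$ having its
probability measure concentrated at zero. To this end, Theorem 15 is naturally applicable,
as the absolute central moments of a Gaussian random variable $X$ with mean zero and
variance $\sigma^2$ are well known:

\[ \mathbb{E}|X|^k = \begin{cases} \sqrt{\frac{2}{\pi}}\, \sigma^k (k-1)!! & \text{if } k \text{ odd}, \\ \sigma^k (k-1)!! & \text{if } k \text{ even}. \end{cases} \]

Since $Z_m = X_m Y_m$ is a product of independent Gaussian random variables, this gives

\[ \mathbb{E}|Z_m|^k = \mathbb{E}|X_m|^k\, \mathbb{E}|Y_m|^k \leq \Big( \frac{|\mathcal{I}|^{1/2}|\mathcal{J}|^{1/2}}{M} \Big)^k \big( (k-1)!! \big)^2 \leq \Big( \frac{|\mathcal{I}|^{1/2}|\mathcal{J}|^{1/2}}{M} \Big)^k k!. \]

Further since $\mathbb{E}|Z_m|^2 = \frac{|\mathcal{I}||\mathcal{J}|}{M^2}$, we may define $L := \frac{2|\mathcal{I}|^{1/2}|\mathcal{J}|^{1/2}}{M}$ to get (10). Later, we will take
$\theta < \delta < \sqrt{2}-1 < \frac{1}{2}$. Considering

\[ t := \frac{\theta\sqrt{M}}{2} < \frac{\sqrt{M}}{4} = \frac{1}{2L} \Big( \sum_{m=1}^M \mathbb{E}|Z_m|^2 \Big)^{1/2}, \]

we therefore have (11), which in this case has the form

\[ \Pr\bigg[ \Big| \Big\langle \sum_{i \in \mathcal{I}} \varphi_i, \sum_{j \in \mathcal{J}} \varphi_j \Big\rangle \Big| > \theta (|\mathcal{I}||\mathcal{J}|)^{1/2} \bigg] \leq 2e^{-M\theta^2/4}, \]

where the probability is doubled due to the symmetric distribution of $\sum_{m=1}^M Z_m$. Since we
need to account for all possible choices of $\mathcal{I}$ and $\mathcal{J}$, we will perform a union bound. The
total number of choices is given by

\[ \sum_{|\mathcal{I}|=1}^{K} \sum_{|\mathcal{J}|=1}^{K} \binom{N}{|\mathcal{I}|} \binom{N-|\mathcal{I}|}{|\mathcal{J}|} \leq K^2 \binom{N}{K}^2 \leq N^{2K}, \]

and so the union bound gives

\[ \Pr\big[ \Phi \text{ does not have } (K,\theta)\text{-FRO} \big] \leq 2e^{-M\theta^2/4} N^{2K} = 2\exp\Big( -\frac{M\theta^2}{4} + 2K\log N \Big). \tag{13} \]

Thus, Gaussian matrices tend to have FRO, and hence restricted orthogonality by
Theorem 13; this is made more precise below.

Again by Lemma 11, it remains to show that $\delta_1$ is sufficiently small. To this end, we note
that $M\|\varphi_n\|^2$ has chi-squared distribution with $M$ degrees of freedom, and so we can use
another (simpler) concentration-of-measure result; see Lemma 1 of [20]:

\[ \Pr\bigg[ \big| \|\varphi_n\|^2 - 1 \big| \geq 2\Big( \sqrt{\frac{t}{M}} + \frac{t}{M} \Big) \bigg] \leq 2e^{-t} \]

for any $t > 0$. Specifically, we pick

\[ \delta' := 2\Big( \sqrt{\frac{t}{M}} + \frac{t}{M} \Big) \leq 4\sqrt{\frac{t}{M}}, \]

and we perform a union bound over the $N$ choices for $\varphi_n$:

\[ \Pr\big[ \delta_1 > \delta' \big] \leq 2\exp\Big( -\frac{M\delta'^2}{16} + \log N \Big). \tag{14} \]

To summarize, Lemma 11, the union bound, Theorem 13, and (13) and (14) give

\begin{align*}
\Pr\big[ \delta_{2K} > \delta \big]
&\leq \Pr\Big[ \theta_K > \frac{(1-\alpha)\delta}{2} \text{ or } \delta_1 > \alpha\delta \Big] \\
&\leq \Pr\Big[ \theta_K > \frac{(1-\alpha)\delta}{2} \Big] + \Pr\big[ \delta_1 > \alpha\delta \big] \\
&\leq \Pr\Big[ \Phi \text{ does not have } \Big( K, \frac{(1-\alpha)\delta}{2C\log K} \Big)\text{-FRO} \Big] + \Pr\big[ \delta_1 > \alpha\delta \big] \\
&\leq 2\exp\bigg( -\frac{M}{4} \Big( \frac{(1-\alpha)\delta}{2C\log K} \Big)^2 + 2K\log N \bigg) + 2\exp\Big( -\frac{M\alpha^2\delta^2}{16} + \log N \Big),
\end{align*}

and so $M \geq \frac{33C^2}{\delta^2} K \log^2 K \log N$ gives that $\Phi$ has $(2K,\delta)$-RIP with high probability. □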

We note that a version of Theorem 14 also holds for matrices whose entries are independent
Bernoulli random variables taking values $\pm\frac{1}{\sqrt{M}}$ with equal probability. In this case, one can
again apply Theorem 15 by comparing moments with those of the Gaussian distribution;
also, a union bound with $\delta_1$ will not be necessary since the columns have unit norm, meaning
$\delta_1 = 0$.



4. Restricted isometry by the power method 

In the previous section, we established the efficacy of flat restricted orthogonality as a
technique to demonstrate RIP. While flat restricted orthogonality has proven useful in the
past [7], future deterministic RIP constructions might not use this technique. Indeed, it
would be helpful to have other techniques available that demonstrate RIP beyond the
square-root bottleneck. In pursuit of such techniques, we recall that the smallest $\delta$ for which $\Phi$ is
$(K,\delta)$-RIP is given in terms of operator norms in (2). In addition, we notice that for any
self-adjoint matrix $A$,

\[ \|A\|_2 = \|\lambda(A)\|_\infty \leq \|\lambda(A)\|_p, \]

where $\lambda(A)$ denotes the spectrum of $A$ with multiplicities. Let $A = UDU^*$ be the eigenvalue
decomposition of $A$. When $p$ is even, we can express $\|\lambda(A)\|_p$ in terms of an easy-to-calculate
trace:

\[ \|\lambda(A)\|_p^p = \operatorname{Tr}[D^p] = \operatorname{Tr}[(UDU^*)^p] = \operatorname{Tr}[A^p]. \]

Combining these ideas with the fact that $\|\cdot\|_p \to \|\cdot\|_\infty$ pointwise leads to the following:

Theorem 16. Given an $M \times N$ matrix $\Phi$, define

\[ \delta_{K;q} := \max_{\substack{\mathcal{K} \subseteq \{1,\ldots,N\} \\ |\mathcal{K}| = K}} \operatorname{Tr}\big[ (\Phi_\mathcal{K}^*\Phi_\mathcal{K} - I_K)^{2q} \big]^{\frac{1}{2q}}. \]

Then $\Phi$ has the $(K,\delta_{K;q})$-restricted isometry property for every $q \geq 1$. Moreover, the
restricted isometry constant of $\Phi$ is approached by these estimates: $\lim_{q \to \infty} \delta_{K;q} = \delta_K$.
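
To see the convergence in Theorem 16 on a small example, the sketch below computes $\delta_{K;q}$ for several values of $q$ and compares against the exact $\delta_K$ from (2). It assumes NumPy; the function names, the random test matrix, and its dimensions are illustrative choices, and the enumeration over subsets is again only practical for toy sizes.

```python
import itertools
import numpy as np

def sub_gram_deviations(Phi, K):
    """Yield Phi_K^* Phi_K - I_K for every size-K subset of columns."""
    N = Phi.shape[1]
    for cols in itertools.combinations(range(N), K):
        sub = Phi[:, list(cols)]
        yield sub.conj().T @ sub - np.eye(K)

def power_estimate(Phi, K, q):
    """delta_{K;q}: the largest Tr[A^(2q)]^(1/(2q)) over all sub-Gram deviations A."""
    best = 0.0
    for A in sub_gram_deviations(Phi, K):
        trace = max(np.trace(np.linalg.matrix_power(A, 2 * q)).real, 0.0)
        best = max(best, trace ** (1.0 / (2 * q)))
    return best

def ric(Phi, K):
    """Exact delta_K via spectral norms, for comparison."""
    return max(np.linalg.norm(A, 2) for A in sub_gram_deviations(Phi, K))

# Illustrative instance (an assumption): 8 x 12 Gaussian matrix with variance 1/M.
rng = np.random.default_rng(1)
M, N, K = 8, 12, 3
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

delta_K = ric(Phi, K)
estimates = [power_estimate(Phi, K, q) for q in (1, 2, 4, 8)]
print(delta_K, estimates)   # the estimates decrease toward delta_K as q grows
```

Each estimate only requires matrix powers and traces, which is the sense in which the power method trades eigenvalue computations for a quantity that may be easier to bound analytically.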

Similar to flat restricted orthogonality, this power method has a combinatorial aspect that
prompts one to check every sub-Gram matrix of size $K$; one could argue that the power
method is slightly less combinatorial, as flat restricted orthogonality is a statement about
all pairs of disjoint subsets of size $\leq K$. Regardless, the work of Bourgain et al. [7] illustrates
that combinatorial properties can be useful, and there may exist constructions to which the
power method would be naturally applied. Moreover, we note that since $\delta_{K;q}$ approaches $\delta_K$,
a sufficiently large choice of $q$ should deliver better-than-$\varepsilon$ improvement over the Gershgorin
analysis. How large should $q$ be? If we assume $\Phi$ has unit-norm columns, taking $q = 1$ gives

\[ \delta_{K;1}^2 = \max_{\substack{\mathcal{K} \subseteq \{1,\ldots,N\} \\ |\mathcal{K}| = K}} \operatorname{Tr}\big[ (\Phi_\mathcal{K}^*\Phi_\mathcal{K} - I_K)^2 \big] = \max_{\substack{\mathcal{K} \subseteq \{1,\ldots,N\} \\ |\mathcal{K}| = K}} \sum_{i \in \mathcal{K}} \sum_{\substack{j \in \mathcal{K} \\ j \neq i}} |\langle \varphi_i, \varphi_j \rangle|^2 \leq K(K-1)\mu^2, \tag{15} \]

where $\mu$ is the worst-case coherence of $\Phi$. Equality is achieved above whenever $\Phi$ is an ETF,
in which case (15) along with reasoning similar to (4) demonstrates that $\Phi$ is RIP with
sparsity levels on the order of $\sqrt{M}$, as the Gershgorin analysis established. It remains to be
shown how $\delta_{K;2}$ compares. To make this comparison, we apply the power method to random
matrices:

Theorem 17. Construct an $M \times N$ matrix $\Phi$ by drawing each of its entries independently
from a Gaussian distribution with mean zero and variance $\frac{1}{M}$, and take $\delta_{K;q}$ to be as defined
in Theorem 16. Then $\delta_{K;q} \leq \delta$, and therefore $\Phi$ has the $(K,\delta)$-restricted isometry property,
with high probability provided $M \geq \frac{81}{\delta^2} K^{1+1/q} \log\frac{eN}{K}$.

While flat restricted orthogonality comes with a negligible penalty of $\log^2 K$ in the number
of measurements, the power method has a penalty of $K^{1/q}$. As such, the case $q = 1$ uses
the order of $K^2$ measurements, which matches our calculation in (15). Moreover, the power
method with $q = 2$ can demonstrate RIP with $K^{3/2}$ measurements, i.e., $K \sim M^{1/2+1/6}$,
which is considerably better than an $\varepsilon$ improvement over the Gershgorin technique.
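
Proof of Theorem 17. Take $t := \frac{\delta}{3K^{1/(2q)}} - (\frac{K}{M})^{1/2}$ and pick $\mathcal{K} \subseteq \{1,\ldots,N\}$ of size $K$. Then
Theorem II.13 of [13] states

\[ \Pr\bigg[ 1 - \sqrt{\frac{K}{M}} - t \leq \sigma_{\min}(\Phi_\mathcal{K}) \leq \sigma_{\max}(\Phi_\mathcal{K}) \leq 1 + \sqrt{\frac{K}{M}} + t \bigg] \geq 1 - 2e^{-Mt^2/2}. \]

Continuing, we use the fact that $\lambda(\Phi_\mathcal{K}^*\Phi_\mathcal{K}) = \sigma(\Phi_\mathcal{K})^2$ to get

\begin{align*}
1 - 2e^{-Mt^2/2}
&\leq \Pr\bigg[ \Big( 1 - \Big( \sqrt{\frac{K}{M}} + t \Big) \Big)^2 \leq \lambda_{\min}(\Phi_\mathcal{K}^*\Phi_\mathcal{K}) \leq \lambda_{\max}(\Phi_\mathcal{K}^*\Phi_\mathcal{K}) \leq \Big( 1 + \Big( \sqrt{\frac{K}{M}} + t \Big) \Big)^2 \bigg] \\
&\leq \Pr\bigg[ 1 - 3\Big( \sqrt{\frac{K}{M}} + t \Big) \leq \lambda_{\min}(\Phi_\mathcal{K}^*\Phi_\mathcal{K}) \leq \lambda_{\max}(\Phi_\mathcal{K}^*\Phi_\mathcal{K}) \leq 1 + 3\Big( \sqrt{\frac{K}{M}} + t \Big) \bigg], \tag{16}
\end{align*}

where the last inequality follows from the fact that $(\frac{K}{M})^{1/2} + t \leq 1$. Since $\Phi_\mathcal{K}^*\Phi_\mathcal{K}$ and $I_K$
are simultaneously diagonalizable, the spectrum of $\Phi_\mathcal{K}^*\Phi_\mathcal{K} - I_K$ is given by
$\lambda(\Phi_\mathcal{K}^*\Phi_\mathcal{K} - I_K) = \lambda(\Phi_\mathcal{K}^*\Phi_\mathcal{K}) - 1$. Combining this with (16) then gives

\[ \Pr\bigg[ \|\Phi_\mathcal{K}^*\Phi_\mathcal{K} - I_K\|_2 \leq 3\Big( \sqrt{\frac{K}{M}} + t \Big) \bigg] \geq 1 - 2e^{-Mt^2/2}. \]

Considering $\operatorname{Tr}[A^{2q}]^{\frac{1}{2q}} = \|\lambda(A)\|_{2q} \leq K^{\frac{1}{2q}} \|\lambda(A)\|_\infty$, we continue:

\[ \Pr\Big[ \operatorname{Tr}\big[ (\Phi_\mathcal{K}^*\Phi_\mathcal{K} - I_K)^{2q} \big]^{\frac{1}{2q}} \leq \delta \Big] \geq \Pr\Big[ K^{\frac{1}{2q}} \|\Phi_\mathcal{K}^*\Phi_\mathcal{K} - I_K\|_2 \leq \delta \Big] \geq 1 - 2e^{-Mt^2/2}. \]

From here, we perform a union bound over all possible choices of $\mathcal{K}$:

\begin{align*}
\Pr\Big[ \exists \mathcal{K} \text{ s.t. } \operatorname{Tr}\big[ (\Phi_\mathcal{K}^*\Phi_\mathcal{K} - I_K)^{2q} \big]^{\frac{1}{2q}} > \delta \Big]
&\leq \binom{N}{K} \Pr\Big[ \operatorname{Tr}\big[ (\Phi_\mathcal{K}^*\Phi_\mathcal{K} - I_K)^{2q} \big]^{\frac{1}{2q}} > \delta \Big] \\
&\leq 2\exp\Big( -\frac{Mt^2}{2} + K\log\frac{eN}{K} \Big). \tag{17}
\end{align*}

Rearranging $M \geq \frac{81}{\delta^2} K^{1+1/q} \log\frac{eN}{K}$ gives $K^{1/2} \leq \frac{\sqrt{M}\,\delta}{9K^{1/(2q)} \log^{1/2}(eN/K)} \leq \frac{\sqrt{M}\,\delta}{9K^{1/(2q)}}$, and so

\[ \frac{Mt^2}{2} = \frac{M}{2} \Big( \frac{\delta}{3K^{1/(2q)}} - \sqrt{\frac{K}{M}} \Big)^2 \geq \frac{M}{2} \Big( \frac{2\delta}{9K^{1/(2q)}} \Big)^2 \geq 2K\log\frac{eN}{K}. \tag{18} \]

Combining (17) and (18) gives the result. □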



5. Equiangular tight frames as RIP candidates 



In Section 2, we observed that equiangular tight frames (ETFs) are optimal RIP matrices under the Gershgorin analysis. In the present section, we reexamine ETFs as prospective RIP matrices. Specifically, we consider the possibility that certain classes of $M \times N$ ETFs support sparsity levels $K$ larger than the order of $\sqrt{M}$. Before analyzing RIP, let's first observe some important features of ETFs. Recall that the definition of ETFs characterized them in terms of their rows and columns. Interestingly, real ETFs have a natural alternative characterization.

Let $\Phi$ be a real $M \times N$ ETF, and consider the corresponding Gram matrix $\Phi^*\Phi$. From condition (i) of the ETF definition, the diagonal entries of $\Phi^*\Phi$ are 1's. Also, (iii) indicates that the off-diagonal entries are equal in absolute value (to the Welch bound); since $\Phi$ has real entries, each off-diagonal entry of $\Phi^*\Phi$ is either positive or negative. Letting $\mu$ denote the absolute value of the off-diagonal entries, we can decompose the Gram matrix as $\Phi^*\Phi = I_N + \mu S$, where $S$ is a matrix of zeros on the diagonal and $\pm 1$'s on the off-diagonal. Here, $S$ is referred to as a Seidel adjacency matrix, as $S$ encodes the adjacency rule of a simple graph with $i \sim j$ whenever $S[i,j] = -1$; this correspondence originated in [31].
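The decomposition $\Phi^*\Phi = I_N + \mu S$ is easy to carry out numerically. Below is a minimal sketch using NumPy; the helper names `seidel_decomposition` and `mercedes_benz_frame` are hypothetical, and the $2 \times 3$ frame of three unit vectors at mutual angles of 120 degrees is chosen only because it is the smallest nontrivial real ETF, not because it appears in the text.

```python
import numpy as np


def seidel_decomposition(Phi, tol=1e-9):
    """Return (mu, S) such that Phi^T Phi = I + mu * S for a real ETF Phi."""
    gram = Phi.T @ Phi
    n = gram.shape[0]
    off_diagonal = gram[~np.eye(n, dtype=bool)]
    mu = float(np.abs(off_diagonal).max())
    if not np.allclose(np.diag(gram), 1.0, atol=tol):
        raise ValueError("columns must have unit norm")
    if mu <= tol or not np.allclose(np.abs(off_diagonal), mu, atol=tol):
        raise ValueError("columns must be equiangular with nonzero coherence")
    # Entries of (gram - I) / mu are 0 on the diagonal and +1 or -1 elsewhere.
    S = np.rint((gram - np.eye(n)) / mu).astype(int)
    return mu, S


def mercedes_benz_frame():
    """Three unit vectors in R^2 at mutual angles of 120 degrees (a 2 x 3 real ETF)."""
    angles = np.pi / 2 + 2 * np.pi * np.arange(3) / 3
    return np.vstack([np.cos(angles), np.sin(angles)])


Phi = mercedes_benz_frame()
mu, S = seidel_decomposition(Phi)
adjacency = (S == -1).astype(int)   # i ~ j whenever S[i, j] = -1
```

For this frame every pair of columns has a negative inner product, so the graph encoded by $S$ is the complete graph on three vertices.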

There is an important equivalence class amongst ETFs: given an ETF $\Phi$, one can negate any of the columns to form another ETF $\Phi'$. Indeed, the ETF properties are easily verified to hold for this new matrix. For obvious reasons, $\Phi$ and $\Phi'$ are called flipping equivalent. This equivalence plays a key role in the following result, which characterizes real ETFs in terms of a particular class of strongly regular graphs:

Definition 18. We say a simple graph $G$ is strongly regular of the form $\mathrm{srg}(v, k, \lambda, \mu)$ if

(i) $G$ has $v$ vertices,

(ii) every vertex has $k$ neighbors (i.e., $G$ is $k$-regular),

(iii) every two adjacent vertices have $\lambda$ common neighbors, and

(iv) every two non-adjacent vertices have $\mu$ common neighbors.

Theorem 19 (Corollary 5.6 in [32]). Every real $M \times N$ equiangular tight frame with $N > M + 1$ is flipping equivalent to a frame whose Seidel adjacency matrix corresponds to the join of a vertex with a strongly regular graph of the form

\[
\mathrm{srg}\Big(N - 1,\ L,\ \frac{3L - N}{2},\ \frac{L}{2}\Big), \qquad L := \frac{N}{2} - 1 + \Big(1 - \frac{N}{2M}\Big)\sqrt{\frac{M(N-1)}{N-M}}.
\]

Conversely, every such graph corresponds to flipping equivalence classes of equiangular tight frames in the same manner.

The previous two sections illustrated the main issue with the Gershgorin analysis: it ignores important cancellations in the sub-Gram matrices. We suspect that such cancellations would be more easily observed in a real ETF, since Theorem 19 neatly represents the Gram matrix's off-diagonal oscillations in terms of adjacencies in a strongly regular graph. The following result gives a taste of how useful this graph representation can be:

Theorem 20. Take a real equiangular tight frame $\Phi$ with worst-case coherence $\mu$, and let $G$ denote the corresponding strongly regular graph in Theorem 19. Then the restricted isometry constant of $\Phi$ is given by $\delta_K = (K - 1)\mu$ for every $K \leq \omega(G) + 1$, where $\omega(G)$ denotes the size of the largest clique in $G$.

Proof. The Gershgorin analysis (3) gives the bound $\delta_K \leq (K - 1)\mu$, and so it suffices to prove $\delta_K \geq (K - 1)\mu$. Since $K \leq \omega(G) + 1$, there exists a clique of size $K$ in the join of $G$ with a vertex. Let $\mathcal{K}$ denote the vertices of this clique, and take $S_\mathcal{K}$ to be the corresponding Seidel adjacency submatrix. In this case, $S_\mathcal{K} = I_K - J_K$, where $J_K$ is the $K \times K$ matrix of all 1's. Observing the decomposition $\Phi_\mathcal{K}^*\Phi_\mathcal{K} = I_K + \mu S_\mathcal{K}$, it follows from the definition of the restricted isometry constant that

\[
\delta_K \geq \|\Phi_\mathcal{K}^*\Phi_\mathcal{K} - I_K\|_2 = \|\mu S_\mathcal{K}\|_2 = \mu\|I_K - J_K\|_2 = (K - 1)\mu,
\]

which concludes the proof. □

This result indicates that the Gershgorin analysis is tight for all real ETFs, at least for sufficiently small values of $K$. In particular, in order for a real ETF to be RIP beyond the square-root bottleneck, its graph must have a small clique number. As an example, note that the first four columns of the Steiner ETF in (5) have negative inner products with each other, and thus the corresponding subgraph is a clique. In general, each block of an $M \times N$ Steiner ETF, whose size is guaranteed to be $O(\sqrt{M})$, is a lower-dimensional simplex and therefore has this property; this is an alternative proof that the Gershgorin analysis of Steiner ETFs is tight for $K = O(\sqrt{M})$.
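The mechanism behind Theorem 20 can be observed on a tiny example by computing the restricted isometry constant by brute force. The sketch below is illustrative only: `restricted_isometry_constant` is a hypothetical helper that enumerates every $K$-subset of columns (exponential cost, so it is usable only for very small frames), and the test frame is the simplex of three unit vectors in $\mathbb{R}^2$, whose columns all have pairwise negative inner products and therefore behave like a clique in the sense used in the proof.

```python
import itertools

import numpy as np


def restricted_isometry_constant(Phi, K):
    """Brute-force delta_K: max over K-subsets of ||Phi_K^* Phi_K - I_K||_2."""
    N = Phi.shape[1]
    delta = 0.0
    for cols in itertools.combinations(range(N), K):
        sub = Phi[:, cols]
        deviation = sub.conj().T @ sub - np.eye(K)
        # Spectral norm of the Hermitian matrix Phi_K^* Phi_K - I_K.
        delta = max(delta, float(np.linalg.norm(deviation, 2)))
    return delta


def mercedes_benz_frame():
    """Three unit vectors in R^2 at mutual angles of 120 degrees (coherence 1/2)."""
    angles = np.pi / 2 + 2 * np.pi * np.arange(3) / 3
    return np.vstack([np.cos(angles), np.sin(angles)])


if __name__ == "__main__":
    Phi = mercedes_benz_frame()
    mu = 0.5
    for K in (2, 3):
        print(K, restricted_isometry_constant(Phi, K), (K - 1) * mu)
```

In this example the computed constants coincide with the Gershgorin value $(K-1)\mu$ for both $K = 2$ and $K = 3$, which is the tightness phenomenon described above.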



5.1. Equiangular tight frames with flat restricted orthogonality. To find ETFs that are RIP beyond the square-root bottleneck, we must apply better techniques than Gershgorin. We first consider what it means for an ETF to have $(K, \hat\theta)$-flat restricted orthogonality. Take a real ETF $\Phi = [\varphi_1 \cdots \varphi_N]$ with worst-case coherence $\mu$, and note that the corresponding Seidel adjacency matrix $S$ can be expressed in terms of the usual $\{0,1\}$-adjacency matrix $A$ of the same graph: $S[i,j] = 1 - 2A[i,j]$ whenever $i \neq j$. Therefore, for every disjoint $\mathcal{I}, \mathcal{J} \subseteq \{1, \ldots, N\}$ with $|\mathcal{I}|, |\mathcal{J}| \leq K$, we want

\[
\hat\theta(|\mathcal{I}||\mathcal{J}|)^{1/2}
\geq \Big|\Big\langle \sum_{i \in \mathcal{I}} \varphi_i, \sum_{j \in \mathcal{J}} \varphi_j \Big\rangle\Big|
= \mu\Big|\sum_{i \in \mathcal{I}}\sum_{j \in \mathcal{J}} S[i,j]\Big|
= 2\mu\Big|E(\mathcal{I}, \mathcal{J}) - \frac{1}{2}|\mathcal{I}||\mathcal{J}|\Big|,
\tag{19}
\]

where $E(\mathcal{I}, \mathcal{J})$ denotes the number of edges between $\mathcal{I}$ and $\mathcal{J}$ in the graph. This condition bears a striking resemblance to the following well-known result in graph theory:

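Lemma 21 (Expander mixing lemma [19]). Given a $d$-regular graph of $n$ vertices, the second largest eigenvalue $\lambda$ of its adjacency matrix satisfies

\[
\Big|E(\mathcal{I}, \mathcal{J}) - \frac{d}{n}|\mathcal{I}||\mathcal{J}|\Big| \leq \lambda(|\mathcal{I}||\mathcal{J}|)^{1/2}
\]

for every pair of vertex subsets $\mathcal{I}, \mathcal{J}$.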



In words, the expander mixing lemma says that the number of edges between vertex subsets of a regular graph is roughly what you would expect in a random regular graph. For this lemma to be applicable to (19), we need the strongly regular graph of Theorem 19 to satisfy $\frac{L}{N-1} = \frac{d}{n} \approx \frac{1}{2}$. Using the formula for $L$, it is not difficult to show that $|\frac{L}{N-1} - \frac{1}{2}| = O(M^{-1/2})$ provided $N = O(M)$ and $N \geq 2M$. Furthermore, the second largest eigenvalue of the strongly regular graph will be $\lambda \approx \frac{1}{2}N^{1/2}$, and so the expander mixing lemma says the optimal $\hat\theta$ is $\leq 2\mu\lambda \approx (\frac{N-M}{M})^{1/2}$ since $\mu = \sqrt{\frac{N-M}{M(N-1)}}$. This is a rather weak estimate for $\hat\theta$ because the expander mixing lemma does not account for the sizes of $\mathcal{I}$ and $\mathcal{J}$ being $\leq K$. Put in this light, a real ETF that has flat restricted orthogonality corresponds to a strongly regular graph that satisfies a particularly strong version of the expander mixing lemma.
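As a small illustration of the expander mixing lemma, the sketch below builds the Paley graph of order 13 and checks the inequality on random disjoint vertex subsets. Two assumptions are made and should be read as such: the helper names are hypothetical, and $\lambda$ is taken to be the second largest eigenvalue in absolute value, which is the form in which the lemma is usually stated.

```python
import numpy as np


def paley_adjacency(p):
    """{0,1}-adjacency matrix of the Paley graph for a prime p = 1 mod 4."""
    residues = {(a * a) % p for a in range(1, p)}
    A = np.zeros((p, p), dtype=int)
    for i in range(p):
        for j in range(p):
            if i != j and (i - j) % p in residues:
                A[i, j] = 1
    return A


def second_eigenvalue_modulus(A):
    """Second largest eigenvalue of A in absolute value (assumed form of lambda)."""
    moduli = np.sort(np.abs(np.linalg.eigvalsh(A.astype(float))))
    return float(moduli[-2])


def mixing_gap(A, X, J):
    """|E(X, J) - (d/n)|X||J|| for disjoint index arrays X and J."""
    n = A.shape[0]
    d = int(A[0].sum())                      # the graph is d-regular
    edges = int(A[np.ix_(X, J)].sum())       # edges between X and J
    return abs(edges - d * len(X) * len(J) / n)


if __name__ == "__main__":
    A = paley_adjacency(13)
    lam = second_eigenvalue_modulus(A)
    rng = np.random.default_rng(0)
    perm = rng.permutation(13)
    X, J = perm[:4], perm[4:9]
    print(mixing_gap(A, X, J), "<=", lam * np.sqrt(len(X) * len(J)))
```

Note that the right-hand side depends on the subsets only through $|\mathcal{I}||\mathcal{J}|$, which is why the lemma cannot exploit the constraint that both sets have size at most $K$.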
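
5.2. Equiangular tight frames and the power method. Next, we try applying the power method to ETFs. Given a real ETF $\Phi = [\varphi_1 \cdots \varphi_N]$, let $H := \Phi^*\Phi - I_N$ denote the "hollow" Gram matrix. Also, take $E_\mathcal{K}$ to be the $N \times K$ matrix built from the columns of $I_N$ that are indexed by $\mathcal{K}$. Then

\[
\mathrm{Tr}\big[(\Phi_\mathcal{K}^*\Phi_\mathcal{K} - I_K)^{2q}\big]
= \mathrm{Tr}\big[(E_\mathcal{K}^*\Phi^*\Phi E_\mathcal{K} - I_K)^{2q}\big]
= \mathrm{Tr}\big[(E_\mathcal{K}^* H E_\mathcal{K})^{2q}\big]
= \mathrm{Tr}\big[(H E_\mathcal{K} E_\mathcal{K}^*)^{2q}\big].
\]

Since $E_\mathcal{K}E_\mathcal{K}^* = \sum_{k \in \mathcal{K}} \delta_k\delta_k^*$, where $\delta_k$ is the $k$th identity basis element, we continue:

\[
\begin{aligned}
\mathrm{Tr}\big[(\Phi_\mathcal{K}^*\Phi_\mathcal{K} - I_K)^{2q}\big]
&= \mathrm{Tr}\Big[\Big(H\sum_{k \in \mathcal{K}} \delta_k\delta_k^*\Big)^{2q}\Big] \\
&= \sum_{k_0 \in \mathcal{K}} \cdots \sum_{k_{2q-1} \in \mathcal{K}} \mathrm{Tr}\big[H\delta_{k_0}\delta_{k_0}^* H\delta_{k_1}\delta_{k_1}^* \cdots H\delta_{k_{2q-1}}\delta_{k_{2q-1}}^*\big] \\
&= \sum_{k_0 \in \mathcal{K}} \cdots \sum_{k_{2q-1} \in \mathcal{K}} \delta_{k_0}^* H\delta_{k_1} \cdots \delta_{k_{2q-1}}^* H\delta_{k_0},
\end{aligned}
\tag{20}
\]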



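where the last step used the cyclic property of the trace. From here, note that $H$ has a zero diagonal, meaning several of the terms in (20) are zero, namely, those for which $k_{i+1} = k_i$ for some $i \in \mathbb{Z}_{2q}$. To simplify (20), take $\mathcal{K}^{(2q)}$ to be the set of $2q$-tuples satisfying $k_{i+1} \neq k_i$ for every $i \in \mathbb{Z}_{2q}$:

\[
\mathrm{Tr}\big[(\Phi_\mathcal{K}^*\Phi_\mathcal{K} - I_K)^{2q}\big]
= \mu^{2q}\sum_{\{k_i\} \in \mathcal{K}^{(2q)}}\ \prod_{i \in \mathbb{Z}_{2q}} S[k_i, k_{i+1}],
\tag{21}
\]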



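where $\mu$ is the worst-case coherence of $\Phi$, and $S$ is the corresponding Seidel adjacency matrix. Note that the left-hand side is necessarily nonnegative, while it is not immediate why the right-hand side should be. This indicates that more simplification can be done, but for the sake of clarity, we will perform this simplification in the special case where $q = 2$; the general case is very similar. When $q = 2$, we are concerned with 4-tuples $\{k_0, k_1, k_2, k_3\} \in \mathcal{K}^{(4)}$. Let's partition these 4-tuples according to the value taken by $k_0$ and $k_q = k_2$. Note, for a fixed $k_0$ and $k_2$, that $k_1$ can be any value other than $k_0$ or $k_2$, as can $k_3$. This leads to the following simplification:

\[
\begin{aligned}
\sum_{\{k_i\} \in \mathcal{K}^{(4)}}\ \prod_{i \in \mathbb{Z}_4} S[k_i, k_{i+1}]
&= \sum_{k_0 \in \mathcal{K}}\sum_{k_2 \in \mathcal{K}} \Big(\sum_{\substack{k_1 \in \mathcal{K} \\ k_0 \neq k_1 \neq k_2}} S[k_0, k_1]S[k_1, k_2]\Big)\Big(\sum_{\substack{k_3 \in \mathcal{K} \\ k_2 \neq k_3 \neq k_0}} S[k_2, k_3]S[k_3, k_0]\Big) \\
&= \sum_{k_0 \in \mathcal{K}}\sum_{k_2 \in \mathcal{K}} \Big|\sum_{\substack{k \in \mathcal{K} \\ k_0 \neq k \neq k_2}} S[k_0, k]S[k, k_2]\Big|^2 \\
&= \sum_{k_0 \in \mathcal{K}} \Big|\sum_{\substack{k \in \mathcal{K} \\ k \neq k_0}} S[k_0, k]S[k, k_0]\Big|^2 + \sum_{k_0 \in \mathcal{K}}\sum_{\substack{k_2 \in \mathcal{K} \\ k_2 \neq k_0}} \Big|\sum_{\substack{k \in \mathcal{K} \\ k_0 \neq k \neq k_2}} S[k_0, k]S[k, k_2]\Big|^2.
\end{aligned}
\]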

The first term above is $K(K - 1)^2$, while the other term is not as easy to analyze, as we expect a certain degree of cancellation. Substituting this simplification into (21) gives

\[
\mathrm{Tr}\big[(\Phi_\mathcal{K}^*\Phi_\mathcal{K} - I_K)^4\big]
= \mu^4\Big(K(K - 1)^2 + \sum_{k_0 \in \mathcal{K}}\sum_{\substack{k_2 \in \mathcal{K} \\ k_2 \neq k_0}} \Big|\sum_{\substack{k \in \mathcal{K} \\ k_0 \neq k \neq k_2}} S[k_0, k]S[k, k_2]\Big|^2\Big).
\]

If there were no cancellations in the second term, then it would equal $K(K - 1)(K - 2)^2$, thereby dominating the expression. However, if oscillations occurred as a $\pm 1$ Bernoulli random variable, we could expect this term to be on the order of $K^3$, matching the order of the first term. In this hypothetical case, since $\mu \leq M^{-1/2}$, the parameter $\delta_{K;2}$ defined for the power method scales as $K^{3/4}/M^{1/2}$, and so $M \sim K^{3/2}$; this corresponds to the behavior exhibited in Theorem 17. To summarize, much like flat restricted orthogonality, applying the power method to ETFs leads to interesting combinatorial questions regarding subgraphs, even when $q = 2$.
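The $q = 2$ simplification amounts to the identity $\mathrm{Tr}[S_\mathcal{K}^4] = \|S_\mathcal{K}^2\|_F^2$, split into diagonal and off-diagonal contributions. The following sketch checks it on a random symmetric $\pm 1$ matrix with zero diagonal; the random matrix is only a stand-in for a Seidel adjacency submatrix (it need not come from an ETF), and both function names are hypothetical.

```python
import numpy as np


def random_seidel_matrix(K, rng):
    """Random symmetric matrix with zero diagonal and +1/-1 off-diagonal entries."""
    upper = np.triu(rng.choice([-1, 1], size=(K, K)), k=1)
    return upper + upper.T


def trace_fourth_power_terms(S):
    """Split Tr[S^4] into the k0 = k2 term and the k0 != k2 term of the text."""
    K = S.shape[0]
    S2 = S @ S                                # (S^2)[k0, k2] = sum_k S[k0, k] S[k, k2]
    first = K * (K - 1) ** 2                  # diagonal of S^2 equals K - 1 everywhere
    off_diagonal = S2 - np.diag(np.diag(S2))
    second = int((off_diagonal ** 2).sum())   # squared inner sums over k0 != k2
    return first, second


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K = 12
    S = random_seidel_matrix(K, rng)
    first, second = trace_fourth_power_terms(S)
    print("first term:", first, " second term:", second)
    print("no-cancellation value:", K * (K - 1) * (K - 2) ** 2)
```

Because the diagonal of $S$ is zero, the constraint $k_0 \neq k \neq k_2$ in the inner sums is enforced automatically by the matrix product. For a clique, where $S_\mathcal{K} = I_K - J_K$, the second term attains the no-cancellation value $K(K-1)(K-2)^2$.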



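5.3. The Paley equiangular tight frame as an RIP candidate. Pick some prime $p \equiv 1 \bmod 4$, and build an $M \times p$ matrix $H$ by selecting the $M := \frac{p+1}{2}$ rows of the $p \times p$ discrete Fourier transform matrix which are indexed by $Q$, the quadratic residues modulo $p$ (including zero). To be clear, the entries of $H$ are scaled to have unit modulus. Next, take $D$ to be an $M \times M$ diagonal matrix whose zeroth diagonal entry is $\sqrt{1/p}$, and whose remaining $M - 1$ entries are $\sqrt{2/p}$. Now build the matrix $\Phi$ by concatenating $DH$ with the zeroth identity basis element; for example, when $p = 5$, we have a $3 \times 6$ matrix:

\[
\Phi = \begin{bmatrix}
\sqrt{\tfrac{1}{5}} & \sqrt{\tfrac{1}{5}} & \sqrt{\tfrac{1}{5}} & \sqrt{\tfrac{1}{5}} & \sqrt{\tfrac{1}{5}} & 1 \\
\sqrt{\tfrac{2}{5}} & \sqrt{\tfrac{2}{5}}e^{-2\pi i/5} & \sqrt{\tfrac{2}{5}}e^{-2\pi i 2/5} & \sqrt{\tfrac{2}{5}}e^{-2\pi i 3/5} & \sqrt{\tfrac{2}{5}}e^{-2\pi i 4/5} & 0 \\
\sqrt{\tfrac{2}{5}} & \sqrt{\tfrac{2}{5}}e^{-2\pi i 4/5} & \sqrt{\tfrac{2}{5}}e^{-2\pi i 3/5} & \sqrt{\tfrac{2}{5}}e^{-2\pi i 2/5} & \sqrt{\tfrac{2}{5}}e^{-2\pi i/5} & 0
\end{bmatrix}.
\]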






We claim that in general, this process produces an $M \times 2M$ equiangular tight frame, which we call the Paley ETF [24]. Presuming for the moment that this claim is true, we have the following result which lends hope for the Paley ETF as an RIP matrix:

Lemma 22. An $M \times 2M$ Paley equiangular tight frame has restricted isometry constant $\delta_K < 1$ for all $K \leq M$.

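Proof. First, we note that Theorem 6 of [1] used Chebotarëv's theorem [28] to prove that the spark of the $M \times 2M$ Paley ETF $\Phi$ is $M + 1$, that is, every size-$M$ subcollection of columns of $\Phi$ forms a spanning set. Thus, for every $\mathcal{K} \subseteq \{1, \ldots, 2M\}$ of size $\leq M$, the smallest singular value of $\Phi_\mathcal{K}$ is positive. It remains to show that the square of the largest singular value is strictly less than 2. Let $x$ be a unit vector for which $\|\Phi_\mathcal{K}^* x\| = \|\Phi_\mathcal{K}^*\|_2$. Then since the spark of $\Phi$ is $M + 1$, the columns of $\Phi_{\mathcal{K}^c}$ span, and so

\[
\|\Phi_\mathcal{K}\|_2^2 = \|\Phi_\mathcal{K}^*\|_2^2 = \|\Phi_\mathcal{K}^* x\|^2 < \|\Phi_\mathcal{K}^* x\|^2 + \|\Phi_{\mathcal{K}^c}^* x\|^2 = \|\Phi^* x\|^2 \leq \|\Phi^*\|_2^2 = \|\Phi\Phi^*\|_2 = 2,
\]

where the final step follows from conditions (i) and (ii) of the ETF definition, which imply $\Phi\Phi^* = 2I_M$. □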

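Now that we have an interest in the Paley ETF $\Phi$, we wish to verify that it is, in fact, an ETF. It suffices to show that the columns of $\Phi$ have unit norm, and that the inner products between distinct columns equal the Welch bound in absolute value. Certainly, the zeroth identity basis element is unit-norm, while the squared norm of each of the other columns is given by $\frac{1}{p} + (M - 1)\frac{2}{p} = \frac{2M - 1}{p} = 1$. Also, the inner product between the zeroth identity basis element and any other column equals the zeroth entry of that column: $p^{-1/2} = (\frac{N - M}{M(N - 1)})^{1/2}$.

It remains to calculate the inner product between distinct columns which are not identity basis elements. To this end, note that since $a^2 = b^2$ if and only if $a = \pm b$, the sequence $\{k^2\}_{k=1}^{p-1} \subseteq \mathbb{Z}_p$ doubly covers $Q \setminus \{0\}$, and so

\[
\langle \varphi_n, \varphi_{n'} \rangle
= \frac{1}{p} + \frac{2}{p}\sum_{m \in Q \setminus \{0\}} e^{2\pi i (n' - n)m/p}
= \frac{1}{p} + \frac{1}{p}\sum_{k=1}^{p-1} e^{2\pi i (n' - n)k^2/p}
= \frac{1}{p}\sum_{k=0}^{p-1} e^{2\pi i (n' - n)k^2/p}.
\]

This well-known expression is called a quadratic Gauss sum, and since $p \equiv 1 \bmod 4$, its value is determined by the Legendre symbol in the following way: $\langle \varphi_n, \varphi_{n'} \rangle = \frac{1}{\sqrt{p}}\big(\frac{n' - n}{p}\big)$ for every $n, n' \in \mathbb{Z}_p$ with $n \neq n'$, where

\[
\Big(\frac{k}{p}\Big) :=
\begin{cases}
+1 & \text{if } k \text{ is a nonzero quadratic residue modulo } p, \\
0 & \text{if } k = 0, \\
-1 & \text{otherwise.}
\end{cases}
\]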
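The construction and the three ETF properties can be verified numerically for small primes. The following is a minimal sketch using NumPy; `paley_etf` and `legendre` are hypothetical helper names, the sign convention in the exponent follows the $p = 5$ example, and primality of $p$ is assumed rather than checked.

```python
import numpy as np


def legendre(k, p):
    """Legendre symbol (k / p) for an odd prime p, via Euler's criterion."""
    k %= p
    if k == 0:
        return 0
    return 1 if pow(k, (p - 1) // 2, p) == 1 else -1


def paley_etf(p):
    """M x 2M Paley ETF for a prime p = 1 mod 4 (primality is not checked)."""
    if p % 4 != 1:
        raise ValueError("p must be congruent to 1 mod 4")
    Q = sorted({(a * a) % p for a in range(p)})    # quadratic residues, including 0
    M = len(Q)                                     # equals (p + 1) / 2
    H = np.exp(-2j * np.pi * np.outer(Q, np.arange(p)) / p)   # unit-modulus DFT rows
    d = np.full(M, np.sqrt(2.0 / p))
    d[0] = np.sqrt(1.0 / p)                        # row indexed by the residue 0
    e0 = np.zeros((M, 1))
    e0[0, 0] = 1.0                                 # zeroth identity basis element
    return np.hstack([d[:, None] * H, e0])


if __name__ == "__main__":
    p = 13
    Phi = paley_etf(p)
    gram = Phi.conj().T @ Phi
    M, N = Phi.shape
    print("shape:", (M, N))
    print("unit norms:", np.allclose(np.diag(gram).real, 1.0))
    print("tight:", np.allclose(Phi @ Phi.conj().T, 2 * np.eye(M)))
    off = np.abs(gram[~np.eye(N, dtype=bool)])
    print("equiangular at 1/sqrt(p):", np.allclose(off, 1 / np.sqrt(p)))
```

The Gram matrix computed this way is real up to rounding, and its entries among the Fourier columns reproduce the Legendre symbol scaled by $p^{-1/2}$.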

Having established that $\Phi$ is an ETF, we notice that the inner products between distinct columns of $\Phi$ are real. This implies that the columns of $\Phi$ can be unitarily rotated to form a real ETF $\Psi$; indeed, one may take $\Psi$ to be the $M \times 2M$ matrix formed by taking the nonzero rows of $L^T$ in the Cholesky factorization $\Phi^*\Phi = LL^T$. As such, we consider the Paley ETF to be real. From here, Theorem 19 prompts us to find the corresponding strongly regular graph. First, we can flip the identity basis element so that its inner products with the other columns of $\Phi$ are all negative. As such, the corresponding vertex in the graph will be adjacent to each of the other vertices; naturally, this will be the vertex to which the strongly regular graph is joined. For the remaining vertices, $n \sim n'$ precisely when $\big(\frac{n' - n}{p}\big) = -1$, that is, when $n' - n$ is not a quadratic residue. The corresponding subgraph is therefore the complement of the Paley graph, which is again isomorphic to the Paley graph [26]. In general, Paley graphs of order $p$ necessarily have $p \equiv 1 \bmod 4$, and so this correspondence is particularly natural.

One interesting thing about the Paley ETF's restricted isometry is that it lends insight into important properties of the Paley graph. The following is the best known upper bound for the clique number of the Paley graph of prime order (see Theorem 13.14 of [6] and discussion thereafter), and we give a new proof of this bound using restricted isometry:

Theorem 23. Let $G$ denote the Paley graph of prime order $p$. Then the size of the largest clique is $\omega(G) < \sqrt{p}$.

Proof. We start by showing $\omega(G) + 1 \leq M$. Suppose otherwise: that there exists a clique $\mathcal{K}$ of size $M + 1$ in the join of a vertex with $G$. Then the corresponding sub-Gram matrix of the Paley ETF has the form $\Phi_\mathcal{K}^*\Phi_\mathcal{K} = (1 + \mu)I_{M+1} - \mu J_{M+1}$, where $\mu = p^{-1/2}$ is the worst-case coherence and $J_{M+1}$ is the $(M + 1) \times (M + 1)$ matrix of 1's. Since the largest eigenvalue of $J_{M+1}$ is $M + 1$, the smallest eigenvalue of $\Phi_\mathcal{K}^*\Phi_\mathcal{K}$ is $1 + p^{-1/2} - (M + 1)p^{-1/2} = 1 - \frac{1}{2}(p + 1)p^{-1/2}$, which is negative when $p \geq 5$, contradicting the fact that $\Phi_\mathcal{K}^*\Phi_\mathcal{K}$ is positive semidefinite.

Since $\omega(G) + 1 \leq M$, we can apply Lemma 22 and Theorem 20 to get

\[
1 > \delta_{\omega(G)+1} = (\omega(G) + 1 - 1)\mu = \frac{\omega(G)}{\sqrt{p}},
\tag{22}
\]

and rearranging gives the result. □
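For small primes, both the clique bound and the identity in (22) can be confirmed by exhaustive search. The sketch below is illustrative and makes the following assumptions: the helper names are hypothetical, the Gram matrix is assembled directly from Legendre symbols as $I + \mu S$ (without flipping the identity column, which does not change the restricted isometry constant), and all searches are brute force, so only tiny $p$ are feasible.

```python
import itertools
import math

import numpy as np


def legendre(k, p):
    """Legendre symbol (k / p) for an odd prime p, via Euler's criterion."""
    k %= p
    if k == 0:
        return 0
    return 1 if pow(k, (p - 1) // 2, p) == 1 else -1


def paley_gram(p):
    """Gram matrix I + S / sqrt(p) of the Paley ETF, built from Legendre symbols."""
    N = p + 1
    S = np.zeros((N, N))
    for a in range(p):
        for b in range(p):
            S[a, b] = legendre(a - b, p)
        S[a, p] = S[p, a] = 1.0          # identity basis element (unflipped)
    return np.eye(N) + S / math.sqrt(p)


def paley_clique_number(p):
    """Largest clique in the Paley graph of prime order p, by exhaustive search."""
    residues = {(a * a) % p for a in range(1, p)}
    omega = 1
    for size in range(2, p + 1):
        found = any(
            all((b - a) % p in residues for a, b in itertools.combinations(c, 2))
            for c in itertools.combinations(range(p), size)
        )
        if not found:
            break
        omega = size
    return omega


def rip_constant_from_gram(G, K):
    """Brute-force delta_K from a Gram matrix with unit diagonal."""
    best = 0.0
    for cols in itertools.combinations(range(G.shape[0]), K):
        sub = G[np.ix_(cols, cols)] - np.eye(K)
        best = max(best, float(np.abs(np.linalg.eigvalsh(sub)).max()))
    return best


if __name__ == "__main__":
    p = 13
    omega = paley_clique_number(p)
    delta = rip_constant_from_gram(paley_gram(p), omega + 1)
    print("omega =", omega, " sqrt(p) =", math.sqrt(p))
    print("delta_{omega+1} =", delta, " omega / sqrt(p) =", omega / math.sqrt(p))
```

Since the Paley graph is self-complementary, it does not matter for the clique number whether adjacency is defined through residues or non-residues.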

It is common to apply probabilistic and heuristic reasoning to gain intuition in number theory. For example, consecutive entries of the Legendre symbol are known to mimic certain properties of a $\pm 1$ Bernoulli random variable [22]. Moreover, Paley graphs enjoy a certain quasi-random property that was studied in [11]. On the other hand, Graham and Ringrose [18] showed that, while random graphs of size $p$ have an expected clique number of $(1 + o(1))\,2\log p/\log 2$, Paley graphs of prime order deviate from this random behavior, having a clique number $\geq c\log p\log\log\log p$ infinitely often. The best known universal lower bound, $(1/2 + o(1))\log p/\log 2$, is given in [12], which indicates that the random graph analysis is at least tight in some sense. Regardless, this differs significantly from the upper bound $\sqrt{p}$ in Theorem 23, and it would be nice if probabilistic arguments could be leveraged to improve this bound, or at least provide some intuition.

Note that our proof (22) hinged on the fact that $\delta_{\omega(G)+1} < 1$, courtesy of Lemma 22. Hence, any improvement to our estimate for $\delta_{\omega(G)+1}$ would directly lead to the best known upper bound on the Paley graph's clique number. To approach such an improvement, note that for large $p$, the Fourier portion of the Paley ETF $DH$ is not significantly different from the normalized partial Fourier matrix $(\frac{2}{p+1})^{1/2}H$; indeed, the corresponding sub-Gram matrices $H_\mathcal{K}^*D^2H_\mathcal{K}$ and $\frac{2}{p+1}H_\mathcal{K}^*H_\mathcal{K}$ differ in spectral norm by a quantity that is small for every $\mathcal{K} \subseteq \mathbb{Z}_p$ of the sizes under consideration, and so the difference vanishes. If we view the quadratic residues modulo $p$ (the row indices of $H$) as random, then a random partial Fourier matrix serves as a proxy for the Fourier portion of the Paley ETF. This in mind, we appeal to the following:

Theorem 24 (Theorem 3.2 in [23]). Draw rows from the $N \times N$ discrete Fourier transform matrix uniformly at random with replacement to construct an $M \times N$ matrix, and then normalize the columns to form $\Phi$. Then $\Phi$ has restricted isometry constant $\delta_K \leq \delta$ with probability $1 - \varepsilon$ provided $\frac{M}{\log M} \geq \frac{C}{\delta^2}K\log^2 K\log N\log\varepsilon^{-1}$, where $C$ is a universal constant.

In our case, both $M$ and $N$ scale as $p$, and so picking $\delta$ to achieve equality above gives

\[
\delta^2 = \frac{C'}{p}K\log^2 K\log^2 p\log\varepsilon^{-1}.
\]

Continuing as in (22), denote $\omega = \omega(G)$ and take $K = \omega$ to get

\[
\frac{C'}{p}\omega\log^2\omega\log^2 p\log\varepsilon^{-1} \geq \delta_\omega^2 = \frac{(\omega - 1)^2}{p} \geq \frac{\omega^2}{2p},
\]

where the last step holds once $\omega \geq 4$, and then rearranging gives $\omega/\log^2\omega \leq C''\log^2 p\log\varepsilon^{-1}$ with probability $1 - \varepsilon$. Interestingly, having $\omega/\log^2\omega = O(\log^2 p)$ with high probability (again, under the model that quadratic residues are random) agrees with the results of Graham and Ringrose [18]. This gives some intuition for what we can expect the size of the Paley graph's clique number to be, while at the same time demonstrating the power of Paley ETFs as RIP candidates. We conclude with the following, which can be reformulated in terms of both flat restricted orthogonality and the power method:

Conjecture 25. The Paley equiangular tight frame has the $(K, \delta)$-restricted isometry property with some $\delta < \sqrt{2} - 1$ whenever $K \leq \frac{Cp}{\log^\alpha p}$, for some universal constants $C$ and $\alpha$.

6. Appendix 

In this section, we prove Theorem 13, which states that a matrix with $(K, \hat\theta)$-flat restricted orthogonality has $\theta_K \leq C\hat\theta\log K$, that is, it has restricted orthogonality. The proof below is adapted from the proof of Lemma 3 in [7]. Our proof has the benefit of being valid for all values of $K$ (as opposed to sufficiently large $K$ in the original [7]), and it has near-optimal constants where appropriate. Moreover in this version, the columns of the matrix are not required to have unit norm.

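Proof of Theorem 13. Given arbitrary disjoint subsets $\mathcal{I}, \mathcal{J} \subseteq \{1, \ldots, N\}$ with $|\mathcal{I}|, |\mathcal{J}| \leq K$, we will bound the following quantity three times, each time with different constraints on $\{x_i\}_{i \in \mathcal{I}}$ and $\{y_j\}_{j \in \mathcal{J}}$:

\[
\Big|\Big\langle \sum_{i \in \mathcal{I}} x_i\varphi_i, \sum_{j \in \mathcal{J}} y_j\varphi_j \Big\rangle\Big|.
\tag{23}
\]

To be clear, our third bound will have no constraints on $\{x_i\}_{i \in \mathcal{I}}$ and $\{y_j\}_{j \in \mathcal{J}}$, thereby demonstrating restricted orthogonality. Note that by assumption, (23) is $\leq \hat\theta(|\mathcal{I}||\mathcal{J}|)^{1/2}$ whenever the $x_i$'s and $y_j$'s are in $\{0, 1\}$. We first show that this bound is preserved when we relax the $x_i$'s and $y_j$'s to lie in the interval $[0, 1]$.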

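Pick a disjoint pair of subsets $\mathcal{I}', \mathcal{J}' \subseteq \{1, \ldots, N\}$ with $|\mathcal{I}'|, |\mathcal{J}'| \leq K$. Starting with some $k \in \mathcal{I}'$, note that flat restricted orthogonality gives that

\[
\Big|\Big\langle \sum_{i \in \mathcal{I} \setminus \{k\}} \varphi_i, \sum_{j \in \mathcal{J}} \varphi_j \Big\rangle\Big|
\leq \hat\theta(|\mathcal{I} \setminus \{k\}||\mathcal{J}|)^{1/2}
\leq \hat\theta(|\mathcal{I}||\mathcal{J}|)^{1/2}
\]

for every disjoint $\mathcal{I}, \mathcal{J} \subseteq \{1, \ldots, N\}$ with $|\mathcal{I}|, |\mathcal{J}| \leq K$ and $k \in \mathcal{I}$. Thus, we may take any $x_k \in [0, 1]$ to form a convex combination of this expression and the corresponding one with $\mathcal{I}$ in place of $\mathcal{I} \setminus \{k\}$, and then the triangle inequality gives

\[
\begin{aligned}
\hat\theta(|\mathcal{I}||\mathcal{J}|)^{1/2}
&\geq x_k\Big|\Big\langle \sum_{i \in \mathcal{I}} \varphi_i, \sum_{j \in \mathcal{J}} \varphi_j \Big\rangle\Big| + (1 - x_k)\Big|\Big\langle \sum_{i \in \mathcal{I} \setminus \{k\}} \varphi_i, \sum_{j \in \mathcal{J}} \varphi_j \Big\rangle\Big| \\
&\geq \Big|x_k\Big\langle \sum_{i \in \mathcal{I}} \varphi_i, \sum_{j \in \mathcal{J}} \varphi_j \Big\rangle + (1 - x_k)\Big\langle \sum_{i \in \mathcal{I} \setminus \{k\}} \varphi_i, \sum_{j \in \mathcal{J}} \varphi_j \Big\rangle\Big| \\
&= \Big|\Big\langle \sum_{i \in \mathcal{I}} \Big\{\begin{array}{ll} x_k, & i = k \\ 1, & i \neq k \end{array}\Big\}\varphi_i, \sum_{j \in \mathcal{J}} \varphi_j \Big\rangle\Big|.
\end{aligned}
\tag{24}
\]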



Since (24) holds for every disjoint $\mathcal{I}, \mathcal{J} \subseteq \{1, \ldots, N\}$ with $|\mathcal{I}|, |\mathcal{J}| \leq K$ and $k \in \mathcal{I}$, we can do the same thing with an additional index $i \in \mathcal{I}'$ or $j \in \mathcal{J}'$, and replace the corresponding unit coefficient with some $x_i$ or $y_j$ in $[0, 1]$. Continuing in this way proves the claim that (23) is $\leq \hat\theta(|\mathcal{I}||\mathcal{J}|)^{1/2}$ whenever the $x_i$'s and $y_j$'s lie in the interval $[0, 1]$.

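For the second bound, we assume the $x_i$'s and $y_j$'s are nonnegative with unit norm: $\sum_{i \in \mathcal{I}} x_i^2 = \sum_{j \in \mathcal{J}} y_j^2 = 1$. To bound (23) in this case, we partition $\mathcal{I}$ and $\mathcal{J}$ according to the size of the corresponding coefficients:

\[
\mathcal{I}_k := \{i \in \mathcal{I} : 2^{-(k+1)} < x_i \leq 2^{-k}\}, \qquad \mathcal{J}_k := \{j \in \mathcal{J} : 2^{-(k+1)} < y_j \leq 2^{-k}\}.
\]

Note the unit-norm constraints ensure that $\mathcal{I} = \bigcup_{k=0}^\infty \mathcal{I}_k$ and $\mathcal{J} = \bigcup_{k=0}^\infty \mathcal{J}_k$ (indices with zero coefficients contribute nothing and may be discarded). The triangle inequality thus gives

\[
\Big|\Big\langle \sum_{i \in \mathcal{I}} x_i\varphi_i, \sum_{j \in \mathcal{J}} y_j\varphi_j \Big\rangle\Big|
\leq \sum_{k_1=0}^\infty\sum_{k_2=0}^\infty 2^{-(k_1+k_2)}\Big|\Big\langle \sum_{i \in \mathcal{I}_{k_1}} \frac{x_i}{2^{-k_1}}\varphi_i, \sum_{j \in \mathcal{J}_{k_2}} \frac{y_j}{2^{-k_2}}\varphi_j \Big\rangle\Big|.
\tag{25}
\]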



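By the definitions of $\mathcal{I}_{k_1}$ and $\mathcal{J}_{k_2}$, the coefficients of $\varphi_i$ and $\varphi_j$ in (25) all lie in $[0, 1]$. As such, we continue by applying our first bound:

\[
\Big|\Big\langle \sum_{i \in \mathcal{I}} x_i\varphi_i, \sum_{j \in \mathcal{J}} y_j\varphi_j \Big\rangle\Big|
\leq \sum_{k_1=0}^\infty\sum_{k_2=0}^\infty 2^{-(k_1+k_2)}\hat\theta(|\mathcal{I}_{k_1}||\mathcal{J}_{k_2}|)^{1/2}
= \hat\theta\Big(\sum_{k=0}^\infty 2^{-k}|\mathcal{I}_k|^{1/2}\Big)\Big(\sum_{k=0}^\infty 2^{-k}|\mathcal{J}_k|^{1/2}\Big).
\tag{26}
\]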



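We now observe from the definition of $\mathcal{I}_k$ that

\[
1 = \sum_{i \in \mathcal{I}} x_i^2 = \sum_{k=0}^\infty\sum_{i \in \mathcal{I}_k} x_i^2 > \sum_{k=0}^\infty 4^{-(k+1)}|\mathcal{I}_k|.
\]

Thus for any positive integer $t$, the Cauchy-Schwarz inequality gives

\[
\begin{aligned}
\sum_{k=0}^\infty 2^{-k}|\mathcal{I}_k|^{1/2}
&= \sum_{k=0}^{t-1} 2^{-k}|\mathcal{I}_k|^{1/2} + \sum_{k=t}^\infty 2^{-k}|\mathcal{I}_k|^{1/2} \\
&\leq t^{1/2}\Big(\sum_{k=0}^{t-1} 4^{-k}|\mathcal{I}_k|\Big)^{1/2} + K^{1/2}\sum_{k=t}^\infty 2^{-k} \\
&\leq 2\big(t^{1/2} + K^{1/2}2^{-t}\big),
\end{aligned}
\tag{27}
\]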



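and similarly for the $\mathcal{J}_k$'s. For a fixed $K$, we note that (27) is minimized when $K^{1/2}2^{-t} = \frac{t^{-1/2}}{2\log 2}$, and so we pick $t$ to be the smallest positive integer such that $K^{1/2}2^{-t} \leq \frac{t^{-1/2}}{2\log 2}$. With this, we continue (26):

\[
\Big|\Big\langle \sum_{i \in \mathcal{I}} x_i\varphi_i, \sum_{j \in \mathcal{J}} y_j\varphi_j \Big\rangle\Big|
\leq 4\hat\theta\Big(t^{1/2} + \frac{t^{-1/2}}{2\log 2}\Big)^2
= 4\hat\theta\Big(t + \frac{1}{\log 2} + \frac{1}{(2\log 2)^2 t}\Big).
\tag{28}
\]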



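From here, we claim that $t \leq \lceil\frac{\log K}{\log 2}\rceil$. Considering the definition of $t$, this is easily verified for $K = 2, 3, \ldots, 7$ by showing $K^{1/2}2^{-s} \leq \frac{s^{-1/2}}{2\log 2}$ for $s = \lceil\frac{\log K}{\log 2}\rceil$. For $K \geq 8$, one can use calculus to verify the second inequality of the following:

\[
K^{1/2}2^{-\lceil\frac{\log K}{\log 2}\rceil}
\leq K^{1/2}2^{-\frac{\log K}{\log 2}}
\leq \frac{1}{2\log 2}\Big(\frac{\log K}{\log 2} + 1\Big)^{-1/2}
\leq \frac{1}{2\log 2}\Big\lceil\frac{\log K}{\log 2}\Big\rceil^{-1/2},
\]

meaning $t \leq \lceil\frac{\log K}{\log 2}\rceil$. Substituting $t \leq \frac{\log K}{\log 2} + 1$ and $t \geq 1$ into (28) then gives

\[
\Big|\Big\langle \sum_{i \in \mathcal{I}} x_i\varphi_i, \sum_{j \in \mathcal{J}} y_j\varphi_j \Big\rangle\Big|
\leq 4\hat\theta\Big(\frac{\log K}{\log 2} + 1 + \frac{1}{\log 2} + \frac{1}{(2\log 2)^2}\Big)
= \hat\theta(C_0\log K + C_1),
\]

with $C_0 \approx 5.77$, $C_1 \approx 11.85$. As such, (23) is $\leq C'\hat\theta\log K$ with $C' = C_0 + \frac{C_1}{\log 2}$ in this case.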
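The choice of $t$ and the constants $C_0$ and $C_1$ can be reproduced with a few lines of arithmetic. This is a minimal sketch under the reading of (28) given above; `choose_t` and `second_bound_factor` are hypothetical names, and the code evaluates only the quantities that appear explicitly in this step of the proof.

```python
import math

LOG2 = math.log(2.0)


def choose_t(K):
    """Smallest positive integer t with sqrt(K) * 2^(-t) <= t^(-1/2) / (2 log 2)."""
    t = 1
    while math.sqrt(K) * 2.0 ** (-t) > t ** -0.5 / (2.0 * LOG2):
        t += 1
    return t


def second_bound_factor(K):
    """Right-hand side of (28) divided by theta-hat, for the chosen t."""
    t = choose_t(K)
    return 4.0 * (t + 1.0 / LOG2 + 1.0 / ((2.0 * LOG2) ** 2 * t))


# Constants obtained by substituting t <= log K / log 2 + 1 and t >= 1 into (28).
C0 = 4.0 / LOG2
C1 = 4.0 * (1.0 + 1.0 / LOG2 + 1.0 / (2.0 * LOG2) ** 2)

if __name__ == "__main__":
    print("C0 =", round(C0, 2), " C1 =", round(C1, 2))
    for K in (2, 7, 8, 100, 10**6):
        print(K, choose_t(K), math.ceil(math.log2(K)), second_bound_factor(K))
```

The loop in `choose_t` terminates because the left-hand side of the defining inequality decays geometrically in $t$ while the right-hand side decays only like $t^{-1/2}$.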

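We are now ready for the final bound on (23) in which we apply no constraints on the $x_i$'s and $y_j$'s. To do this, we consider the positive and negative real and imaginary parts of these coefficients:

\[
x_i = \sum_{k=0}^3 x_{i,k}\,\mathrm{i}^k \quad \text{s.t.} \quad x_{i,k} \geq 0 \ \ \forall k,
\]

and similarly for the $y_j$'s. With this decomposition, we apply the triangle inequality to get

\[
\Big|\Big\langle \sum_{i \in \mathcal{I}} x_i\varphi_i, \sum_{j \in \mathcal{J}} y_j\varphi_j \Big\rangle\Big|
= \Big|\Big\langle \sum_{i \in \mathcal{I}}\sum_{k_1=0}^3 x_{i,k_1}\mathrm{i}^{k_1}\varphi_i, \sum_{j \in \mathcal{J}}\sum_{k_2=0}^3 y_{j,k_2}\mathrm{i}^{k_2}\varphi_j \Big\rangle\Big|
\leq \sum_{k_1=0}^3\sum_{k_2=0}^3 \Big|\Big\langle \sum_{i \in \mathcal{I}} x_{i,k_1}\varphi_i, \sum_{j \in \mathcal{J}} y_{j,k_2}\varphi_j \Big\rangle\Big|.
\]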

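Finally, we normalize the coefficients by $(\sum_{i \in \mathcal{I}} x_{i,k_1}^2)^{1/2}$ and $(\sum_{j \in \mathcal{J}} y_{j,k_2}^2)^{1/2}$ to apply our second bound:

\[
\Big|\Big\langle \sum_{i \in \mathcal{I}} x_i\varphi_i, \sum_{j \in \mathcal{J}} y_j\varphi_j \Big\rangle\Big|
\leq \sum_{k_1=0}^3\sum_{k_2=0}^3 \Big(\sum_{i \in \mathcal{I}} x_{i,k_1}^2\Big)^{1/2}\Big(\sum_{j \in \mathcal{J}} y_{j,k_2}^2\Big)^{1/2} C'\hat\theta\log K
\leq (C\hat\theta\log K)\|x\|\|y\|,
\]

where $C = 4C' \approx 74.17$ by the Cauchy-Schwarz inequality, and so we are done. □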